Short notes on namedtuples, NamedTuples and Data classes

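Both dataclasses and named tuples are shortcuts for defining lightweight data-holding classes, designed to simplify and reduce code. Dataclasses were inspired by the attrs project ("the one Python library everybody needs"); named tuples are a separate mechanism built on the built-in tuple.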

namedtuple and NamedTuple

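  • Both immutable
  • Tuple-based (hence fast) but far more readable than plain tuples
  • NamedTuple (from typing) is the typed version of namedtuple (from collections)
  • iterable, unpackable, and hashable as long as every field value is hashable
  • backward-compatible with tuple (e.g. you can access a namedtuple by index)
  • default values: the NamedTuple class syntax supports them from Python 3.6.1, the defaults argument of namedtuple from Python 3.7
  • fast! Instances are real tuples, which are implemented in C
  • example [1]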
from collections import namedtuple
from math import sqrt

Point = namedtuple('Point', 'x y')
pt1 = Point(1.0, 5.0)
pt2 = Point(2.5, 1.5)

# use index referencing
line_length = sqrt((pt1[0]-pt2[0])**2 + (pt1[1]-pt2[1])**2)
# use tuple unpacking
x1, y1 = pt1
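
The properties listed above (typed fields, defaults, immutability, hashability) can be sketched with the typed variant as follows. This is a minimal illustration, not part of the original example: the default of 0.0 for y and the names PlainPoint, pt3 and visited are assumptions made here.

```python
from collections import namedtuple
from math import hypot
from typing import NamedTuple


class Point(NamedTuple):
    """Typed 2D point; the default for y is an illustrative choice."""
    x: float
    y: float = 0.0


# Untyped equivalent: `defaults` applies to the rightmost fields (Python 3.7+).
PlainPoint = namedtuple("PlainPoint", "x y", defaults=[0.0])

pt1 = Point(1.0, 5.0)
pt2 = Point(2.5)  # y falls back to the default

# Attribute access instead of index access.
line_length = hypot(pt1.x - pt2.x, pt1.y - pt2.y)

# Immutable: "changing" a field means building a new tuple with _replace.
pt3 = pt1._replace(x=9.0)

# Hashable, so instances can be set members or dict keys.
visited = {pt1, pt2, pt3}
```

Note that the type hints are not checked at runtime: Point("a", "b") would be accepted without complaint.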

Dataclasses

  • Are mutable by default (frozen=True makes instances read-only)
  • post-init processing (__post_init__) can be used to create fields that depend on other fields, or to perform input validation [2]
  • the implementation is written entirely in Python, so they tend to be slower than tuple-based data types
  • runtime type validation is not supported natively (doing it by hand is cumbersome), but it becomes easy with a third-party or hand-written decorator such as @enforce_typing
  • available from Python 3.7
  • are just regular classes (e.g. they support inheritance), without writing boilerplate code
  • inappropriate when API compatibility with tuples or dicts is required
  • bonus PyCon talk, if you have time
  • example [3]
from dataclasses import dataclass

@dataclass(unsafe_hash=True)
class InventoryItem:
    '''Class for keeping track of an item in inventory.'''
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
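
The post-init processing mentioned in the list above could look roughly like the sketch below. OrderLine and its fields are hypothetical names chosen for illustration, not taken from the original notes; the point is only to show a field derived from other fields plus a simple validation step.

```python
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    """Hypothetical example class used only for this sketch."""
    unit_price: float
    quantity: int = 1
    # Excluded from __init__; computed in __post_init__ from the other fields.
    total: float = field(init=False)

    def __post_init__(self) -> None:
        # Input validation: reject a value that makes no sense for an order.
        if self.quantity < 0:
            raise ValueError(f"quantity must be >= 0, got {self.quantity}")
        self.total = self.unit_price * self.quantity


line = OrderLine(unit_price=2.5, quantity=4)
print(line)
```

Because the instance is mutable, total is only computed once at construction; changing quantity afterwards does not update it. A property would be one alternative if the value has to stay in sync.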

Other fancy data types (not from the stdlib)

  • pydantic: enforces type hints at runtime and reports user-readable errors when data is invalid. Seems relatively popular.
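# sample input validation via regexp, using a plain dataclass (no pydantic)
from dataclasses import dataclass
import re

class CustomException(Exception):
    '''Raised when a Widget id does not match the expected pattern.'''

@dataclass
class Widget:
    id: int

    def __post_init__(self):
        # fullmatch: the whole id must be exactly four digits
        if not re.fullmatch(r"[0-9]{4}", str(self.id)):
            raise CustomException(f"{self.id} doesn't follow pattern [0-9]{{4}}")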

  1. brief explanation on Stack Overflow

  2. source

  3. source