Python Review · Iterators, Generators, Comprehensions, Iterable, Hashable

Jan 30, 2016 | Tech Software

Iterators

An iterator is an object that represents a stream of data. Each call to its __next__() method returns the next element, and the iterator keeps internal state to know "what's next." When no elements remain, __next__() raises StopIteration.
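As a minimal sketch of the protocol described above (the list and variable names are only illustrative), `iter()` asks an object for an iterator and `next()` calls that iterator's `__next__()` method:

```python
letters = ["a", "b", "c"]
it = iter(letters)          # ask the list for an iterator

first = next(it)            # 'a'
second = next(it)           # 'b'
third = next(it)            # 'c'

try:
    next(it)                # nothing left to fetch
    exhausted = False
except StopIteration:
    exhausted = True        # the iterator signals that it is empty
```

A `for` loop performs the same steps and handles the `StopIteration` for you.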

range
: returns a lazy, immutable Sequence of numbers (not an Iterator), most often used to drive for loops.
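A short sketch, with arbitrary example values, of how a `range` behaves like a sequence rather than a one-shot iterator:

```python
r = range(2, 10, 2)         # start, stop (exclusive), step

as_list = list(r)           # [2, 4, 6, 8]
second_item = r[1]          # 4, indexing works
tail = r[1:]                # slicing returns another range object
again = list(r)             # [2, 4, 6, 8], the range was not consumed
```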

enumerate
: pairs each element with its index counter.
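For illustration, assuming a small list of strings, `enumerate` yields `(index, element)` tuples; the optional `start` argument sets the first counter value:

```python
fruits = ["apple", "banana", "cherry"]

# Wrap in list() only to make the lazily produced pairs visible.
pairs = list(enumerate(fruits, start=1))
# [(1, 'apple'), (2, 'banana'), (3, 'cherry')]
```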

zip
: combines multiple iterables element-wise into tuples.
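A small sketch with made-up data; note that `zip` stops as soon as the shortest input is exhausted:

```python
names = ["Ann", "Bob", "Cy"]
scores = [90, 85]           # one item shorter than names

paired = list(zip(names, scores))
# [('Ann', 90), ('Bob', 85)]  ('Cy' has no partner and is dropped)
```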

map
: applies a specific function to every item in an iterable.
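As an illustrative sketch, `map` accepts any callable, and with several iterables it passes one item from each per call:

```python
words = ["a", "bb", "ccc"]

lengths = list(map(len, words))                           # [1, 2, 3]
sums = list(map(lambda x, y: x + y, [1, 2], [10, 20]))    # [11, 22]
```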

filter
: keeps only the items in an iterable that satisfy a condition.
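A brief sketch with example numbers; passing `None` as the function keeps the items that are truthy:

```python
numbers = [0, 1, 2, 3, 4, 5]

evens = list(filter(lambda n: n % 2 == 0, numbers))   # [0, 2, 4]
truthy = list(filter(None, numbers))                  # [1, 2, 3, 4, 5]
```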

reversed
: returns a reverse iterator that accesses elements from end to start without modifying the original sequence.
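To illustrate with a throwaway list, `reversed` walks the data backwards and leaves the original untouched:

```python
data = [1, 2, 3]

backwards = list(reversed(data))   # [3, 2, 1]
unchanged = data                   # still [1, 2, 3]
```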

Iterator vs. Sequence

Function	Return Type	Reusable?	Indexing/Slicing?
`range()`	Sequence	Yes	Yes
`enumerate()`	Iterator	No	No
`zip()`	Iterator	No	No
`map()`	Iterator	No	No
`filter()`	Iterator	No	No
`reversed()`	Iterator	No	No

Note:

Sequence: you can loop through it multiple times, and it supports indexing and slicing. A range is also memory-efficient because it computes its values on demand instead of storing them.
Iterator: you can loop through it only once; after that it is exhausted and yields nothing. If you need the data again, convert it to a lasting type such as a list, or recreate the iterator.
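The difference in the note above can be sketched as follows (values are arbitrary): a `range` survives repeated passes, while a `zip` iterator is empty after the first one.

```python
r = range(3)
first_pass = list(r)        # [0, 1, 2]
second_pass = list(r)       # [0, 1, 2], a sequence can be reused

z = zip("ab", [1, 2])
first_zip = list(z)         # [('a', 1), ('b', 2)]
second_zip = list(z)        # [], the iterator is already exhausted

# Convert to a list once if the data is needed again later.
saved = list(zip("ab", [1, 2]))
```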

Generators

A generator is a special kind of iterator that lets you loop over data without holding the entire dataset in memory. It produces values lazily, one at a time, each time execution reaches a yield, which makes it memory-efficient for processing large files or streams.

yield
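: turns a function into a generator function; calling that function returns a generator object instead of running the body immediately.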
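A minimal sketch of a generator function (the name `countdown` is just an example): the body runs only when a value is requested and pauses at each `yield`.

```python
def countdown(n):
    """Yield n, n-1, ..., 1."""
    while n > 0:
        yield n             # pause here and hand back the current value
        n -= 1              # resume from this line on the next request

gen = countdown(3)          # nothing has run yet
first = next(gen)           # 3
rest = list(gen)            # [2, 1]
```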

Expressions
: generator expressions such as (x for x in data) create generator objects inline, without defining a function.
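For illustration, a generator expression looks like a list comprehension in parentheses and is consumed by whatever iterates over it:

```python
squares = (x * x for x in range(5))   # no squares computed yet

total = sum(squares)        # 0 + 1 + 4 + 9 + 16 = 30
leftover = list(squares)    # [], sum() already consumed the generator
```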

Iterators vs. Generators

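Feature	Iterator	Generator
Definition	An object that follows the iterator protocol.	A special iterator built from a function.
Implementation	Class-based (`__iter__` and `__next__`).	Function-based (the `yield` keyword).
State management	Manual (position tracked in instance variables).	Automatic (execution pauses and resumes at each `yield`).
Memory usage	Low (lazy evaluation; values produced on demand).	Low (lazy evaluation; values produced on demand).
Complexity	Higher (more boilerplate code).	Lower (concise, readable syntax).
Typical use	Custom complex objects or data structures.	Data streams, simple sequences, or pipeline computations.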
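To make the comparison concrete, here is one possible sketch of the same countdown written both ways. The names `Countdown` and `countdown` are illustrative only.

```python
class Countdown:
    """Class-based iterator: state lives in an instance attribute."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self             # an iterator returns itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator: Python saves the local state at each yield."""
    while start > 0:
        yield start
        start -= 1


from_class = list(Countdown(3))        # [3, 2, 1]
from_generator = list(countdown(3))    # [3, 2, 1]
```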

Comprehensions

A concise, readable syntax for creating new collections by iterating over existing ones.

list
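A short sketch of list comprehensions with an optional filter clause (example values only):

```python
squares = [x * x for x in range(5)]                      # [0, 1, 4, 9, 16]
even_squares = [x * x for x in range(5) if x % 2 == 0]   # [0, 4, 16]
```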

dict
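A dict comprehension builds key/value pairs in one expression; the sample words below are arbitrary:

```python
words = ["apple", "kiwi"]

lengths = {w: len(w) for w in words}             # {'apple': 5, 'kiwi': 4}
inverted = {n: w for w, n in lengths.items()}    # {5: 'apple', 4: 'kiwi'}
```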

set
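A set comprehension uses braces without a colon and drops duplicates automatically, as in this small sketch:

```python
words = ["Apple", "apple", "Kiwi"]

unique = {w.lower() for w in words}   # {'apple', 'kiwi'}
```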

Comprehensions vs. Generators

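Features	List comprehension `[x for x in data]`	Generator expression `(x for x in data)`
Memory usage	Grows linearly with the data (everything is stored at once)	Roughly constant (only the iteration state is stored)
Speed	Faster when iterating over small data sets	More efficient when early exit is possible (e.g. `any`)
Reusability	Can be accessed, sliced, and indexed repeatedly	Single-use; exhausted after one pass
Result type	A `list` object	A `generator` object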

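Situations	List comprehension `[]`	Generator expression `()`
Result is needed more than once	Yes (a list can be read repeatedly)	No (consumed after one pass)
Indexing or slicing is needed (e.g. `res[0]`)	Yes	No
Very large data sets (millions of items or more)	No (may exhaust memory)	Yes (very low memory usage)
Passed directly as a function argument (e.g. `sum`)	Works, but uses more memory	Best practice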
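The memory and early-exit rows can be checked with a rough sketch like the one below. The size `n` and the helper `is_big` are assumptions made for the demonstration, and `sys.getsizeof` reports only the container itself, not the integers it refers to.

```python
import sys

n = 100_000
as_list = [x for x in range(n)]     # all n items stored up front
as_gen = (x for x in range(n))      # only the iteration state is stored

list_size = sys.getsizeof(as_list)  # grows with n
gen_size = sys.getsizeof(as_gen)    # small and independent of n

checked = []                        # records which items were examined

def is_big(x):
    checked.append(x)
    return x > 2

# any() stops at the first True, so the generator never produces the rest.
found = any(is_big(x) for x in range(n))
# checked is now [0, 1, 2, 3]
```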

Itertools

A standard Python library module providing a collection of efficient, memory-friendly tools for creating and manipulating iterators.

count
: generates an infinite sequence of incrementing numbers.
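Because `count` never ends, a sketch has to take a bounded number of values from it; the start and step here are arbitrary:

```python
from itertools import count

counter = count(start=10, step=5)

first_three = [next(counter) for _ in range(3)]   # [10, 15, 20]
```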

cycle
: repeats an iterable indefinitely in a loop.
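A small illustration with two example colors; as with `count`, only a fixed number of items is pulled from the endless iterator:

```python
from itertools import cycle

colors = cycle(["red", "green"])

picked = [next(colors) for _ in range(5)]
# ['red', 'green', 'red', 'green', 'red']
```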

repeat
: returns the same object repeatedly, either infinitely or a fixed number of times.
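As a sketch, `repeat` with a count is finite, and without one it can feed a constant argument to `map` (which stops when the other input runs out):

```python
from itertools import repeat

three_times = list(repeat("x", 3))               # ['x', 'x', 'x']
squares = list(map(pow, range(4), repeat(2)))    # [0, 1, 4, 9]
```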

chain
: combines multiple iterables into one continuous iterator.
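An illustrative sketch: `chain` accepts iterables of different types, and `chain.from_iterable` flattens one level of nesting:

```python
from itertools import chain

merged = list(chain([1, 2], "ab", (3,)))              # [1, 2, 'a', 'b', 3]
flat = list(chain.from_iterable([[1, 2], [3], []]))   # [1, 2, 3]
```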

islice
: returns selected elements from an iterable, similar to list slicing but works on iterators.
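A minimal sketch, assuming an endless generator of even numbers that could not be sliced with `[2:5]`:

```python
from itertools import count, islice

evens = (n for n in count() if n % 2 == 0)   # 0, 2, 4, 6, 8, ...

# Skip the first two items, then take items up to (not including) index 5.
window = list(islice(evens, 2, 5))           # [4, 6, 8]
```

Unlike list slicing, `islice` consumes the underlying iterator and does not accept negative indices.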

more
: see the itertools module documentation for the full list.

Iterable and Classes

__iter__ + yield
: writing __iter__ as a generator method lets Python handle the iterator protocol for you, so no __next__ is needed.
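One way to sketch this pattern, using a hypothetical `Playlist` class: each call to `__iter__` returns a fresh generator, so the object can be looped over repeatedly.

```python
class Playlist:
    """Iterable container; the class name and data are illustrative."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __iter__(self):
        # Using yield makes this method return a new generator per call.
        for song in self._songs:
            yield song


playlist = Playlist(["intro", "outro"])
first_pass = list(playlist)     # ['intro', 'outro']
second_pass = list(playlist)    # ['intro', 'outro'] again
```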

__iter__ + __next__
: you write the iterator yourself and keep its position in instance attributes, which gives you more control over the iteration process.
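A possible sketch of a hand-written iterator. The `Fib` class and its `produced` counter are assumptions chosen to show state that the caller can inspect:

```python
class Fib:
    """Iterator over Fibonacci numbers not exceeding `limit`."""

    def __init__(self, limit):
        self.limit = limit
        self.a, self.b = 0, 1
        self.produced = 0           # extra state visible to the caller

    def __iter__(self):
        return self

    def __next__(self):
        if self.a > self.limit:
            raise StopIteration
        value = self.a
        self.a, self.b = self.b, self.a + self.b
        self.produced += 1
        return value


fib = Fib(10)
values = list(fib)      # [0, 1, 1, 2, 3, 5, 8]
how_many = fib.produced # 7
```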

__getitem__
: allows you to access elements by index. If __iter__ is missing, Python can still iterate by calling __getitem__ with 0, 1, 2, ... until IndexError is raised.
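A minimal sketch of this fallback, using a made-up `Squares` class that only handles non-negative integer indices (slices are not supported here):

```python
class Squares:
    """Indexable object without __iter__; iteration uses __getitem__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if not 0 <= index < self.n:
            raise IndexError(index)     # tells the for loop where to stop
        return index * index


sq = Squares(4)
third = sq[2]               # 4
all_values = list(sq)       # [0, 1, 4, 9]
```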

__iter__ + __reversed__
: defining __reversed__ lets the built-in reversed() return a reverse iterator for your class.
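As a sketch with a hypothetical `Timeline` class, `reversed()` looks for `__reversed__` and uses the iterator it returns:

```python
class Timeline:
    """Iterable with custom forward and reverse iteration."""

    def __init__(self, events):
        self._events = list(events)

    def __iter__(self):
        yield from self._events

    def __reversed__(self):
        yield from self._events[::-1]


timeline = Timeline(["a", "b", "c"])
forward = list(timeline)                # ['a', 'b', 'c']
backward = list(reversed(timeline))     # ['c', 'b', 'a']
```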

Hashable and Classes

Concept

In Python, an object is considered Hashable if it meets the following three conditions:

Immutable hash: its hash() value, via __hash__ implementation, remains constant during its entire lifetime.
Comparability: it implements the __eq__ method to allow equality checks.
Equality consistency: if a == b is true, then hash(a) must equal hash(b).

Key Usage: Only hashable objects can be used as dict keys or set elements.
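The conditions above can be observed with a quick sketch (the values are arbitrary): equal tuples share a hash and work as dict keys, while a list is rejected.

```python
a, b = (1, 2), (1, 2)
same_hash = (a == b) and (hash(a) == hash(b))   # True

lookup = {a: "ok"}
value = lookup[b]           # 'ok', an equal key finds the same entry

try:
    {[1, 2]: "fails"}       # a list cannot be a dict key
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
```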

Built-in Types

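Type	Hashable?	Reason / notes
`int`, `float`, `str`, `bool`	Yes	Atomic immutable types; the value cannot change after creation.
`tuple`	Depends on contents	Hashable only if every element is itself hashable.
`list`, `dict`, `set`	No	Mutable containers; contents can change at any time, which would invalidate the hash.
`frozenset`	Yes	Immutable version of `set`, designed to be hashable.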
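The table can be verified with a small helper; `is_hashable` is a name invented for this sketch, not a built-in:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


results = {
    "int": is_hashable(1),
    "str": is_hashable("text"),
    "tuple of ints": is_hashable((1, 2)),
    "tuple with list": is_hashable((1, [2])),   # False, contains a list
    "list": is_hashable([1, 2]),
    "dict": is_hashable({"a": 1}),
    "set": is_hashable({1, 2}),
    "frozenset": is_hashable(frozenset({1, 2})),
}
```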

Hashable Classes

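By default, instances of a custom class are hashable: the hash is derived from object identity (the memory address in CPython), and each instance compares equal only to itself. However, if you define __eq__ without also defining __hash__, Python sets __hash__ to None and instances become unhashable, so define the two together.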
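A sketch of both cases, using illustrative classes `Point` (defines both methods) and `EqOnly` (defines only `__eq__`):

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash the same fields that __eq__ compares.
        return hash((self.x, self.y))


class EqOnly:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.x == other.x


deduped = {Point(1, 2), Point(1, 2)}    # one element, equal points collapse
eq_only_hash = EqOnly.__hash__          # None, instances are unhashable
```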

Hashable via `dataclass`

To avoid writing these methods by hand, use @dataclass(frozen=True). With the default eq=True, the decorator generates both __eq__ and __hash__ from the fields and blocks attribute assignment.
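A minimal sketch with a hypothetical two-field `Point`; assignment to a frozen instance raises `FrozenInstanceError`:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Point:
    x: int
    y: int


p = Point(1, 2)
visited = {p}
found = Point(1, 2) in visited      # True, equal fields give an equal hash

try:
    p.x = 10                        # frozen instances reject assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```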

Pitfalls

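Never modify an attribute that contributes to __hash__ or __eq__ while the object is stored in a set or used as a dict key. The container filed the object under its old hash, so later lookups can silently fail even though the object is still inside.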
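The failure mode can be reproduced with a small sketch. The `Tag` class is an assumption made for the demonstration, and the failed lookup relies on the two strings having different hashes, which is the case in practice.

```python
class Tag:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Tag) and self.name == other.name

    def __hash__(self):
        return hash(self.name)      # hash depends on a mutable attribute


tag = Tag("draft")
tags = {tag}
before = tag in tags                # True

tag.name = "final"                  # the hash of the stored object changes
after = tag in tags                 # lookup uses the new hash and misses

# The object is still physically inside the set.
still_stored = any(t is tag for t in tags)
```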
