Python Generators

Generators

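After studying comprehensions, we saw that a comprehension is essentially a for loop written inside a container literal. So why is there no tuple comprehension?

The reason is that the parenthesized form is not called a "tuple comprehension"; it is called a generator expression.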

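What is a generator

A generator is essentially an iterator. It is one way of defining an iterator, and it lets you write the iteration logic yourself. Python reports its type as generator.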

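Iterators vs. generators

Python's built-in iterators, such as those returned by iter(), map() and zip(), have fixed behavior that you cannot change. A generator is an iterator whose logic you define yourself. So every generator is an iterator; the name generator simply distinguishes iterators written with a generator expression or a generator function.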
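A generator is not the only way to write your own iterator: a class that implements __iter__ and __next__ works too. The sketch below produces the same sequence both ways to show how much bookkeeping the generator saves. The names CountUp and count_up are hypothetical, introduced only for this illustration.

```python
from collections.abc import Iterator


class CountUp:
    """Hand-written iterator: produces 1..limit via __iter__/__next__."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current


def count_up(limit):
    """Generator function: same logic, with the state kept by Python."""
    current = 0
    while current < limit:
        current += 1
        yield current


class_values = list(CountUp(3))
gen_values = list(count_up(3))
print(class_values, gen_values)           # [1, 2, 3] [1, 2, 3]
print(isinstance(count_up(3), Iterator))  # True
```

Both objects satisfy the iterator protocol; the generator version simply lets Python remember the position between values.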

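Creation

A generator can be created in two ways:

  • A generator expression, the so-called "tuple comprehension"
  • A generator function, defined with def and containing the yield keyword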

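Generator expressions

Basic syntax

from collections.abc import Iterator, Iterable

# Generator expression (the so-called "tuple comprehension")
gen = (i * 2 for i in range(1, 11))
print(isinstance(gen, Iterable))  # True: gen is an iterable
print(isinstance(gen, Iterator))  # True: gen is also an iterator

# gen is a generator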
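Two properties of a generator expression are easy to miss: values are computed only when requested, and the generator can be consumed only once. A minimal sketch, reusing the same expression (the variable names first, rest and leftover are just for illustration):

```python
gen = (i * 2 for i in range(1, 11))

first = next(gen)     # values are computed on demand
rest = list(gen)      # consumes the remaining nine values
leftover = list(gen)  # the generator is now exhausted

print(first)     # 2
print(rest)      # [4, 6, 8, 10, 12, 14, 16, 18, 20]
print(leftover)  # []
```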

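Generator functions

How is a generator function defined? In the same way as an ordinary function: it is defined with the def keyword, and its body is written like any other function body. The only requirement is that the body uses the yield keyword.

Note that a generator function is still a function. It is a function that contains yield, and calling it creates a generator.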

The yield keyword

yield is similar to return. return is used inside a function to hand a value back to the caller, and yield does the same, but the two differ in important ways.

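yield and return

What they have in common

When execution reaches either statement, the corresponding value is returned to the caller.

How they differ

When return executes, the function exits and none of the statements after it in the function body run. When the function is called again, the whole function runs again from the start.

When yield executes, the value is returned, but the function remembers where it paused. The next time a value is requested from the generator, execution continues from that point. This is the same one-value-at-a-time behavior as fetching values from an iterator.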
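The difference can be seen side by side in a small sketch. The function names with_return and with_yield are hypothetical, chosen only for this comparison.

```python
def with_return():
    return 1
    return 2  # never reached: return ends the call


def with_yield():
    yield 1
    yield 2  # reached on the second next(): execution resumes here


# Each call to an ordinary function starts from the top again.
return_results = [with_return(), with_return()]

# A generator continues from where it last paused.
gen = with_yield()
yield_results = [next(gen), next(gen)]

print(return_results)  # [1, 1]
print(yield_results)   # [1, 2]
```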

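How to use yield

yield can be written in two ways. The first is the same as return: put the value directly after the keyword. This is the recommended form.

The second puts the value in parentheses after yield. This looks like a function call, but yield is a keyword, not a function; the parentheses only group the expression, so both forms behave identically.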
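A quick check that the two spellings produce the same value (the function names are hypothetical):

```python
def plain_form():
    yield 11


def parenthesized_form():
    yield (11)  # parentheses only group the expression


plain_values = list(plain_form())
parenthesized_values = list(parenthesized_form())

print(plain_values)          # [11]
print(parenthesized_values)  # [11]
```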

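Basic use of generator functions

from collections.abc import Iterator


# 1. Define a generator function
# A generator function is a function that uses yield
def myGen():
    print(1)
    yield 11
    print(2)
    yield 22
    print(3)
    yield 33


# 2. Create the generator
# Calling a generator function returns a generator object, or generator for short
gen = myGen()
res = isinstance(gen, Iterator)
print(res)  # True: a generator is an iterator


# 3. Get values from the generator
# A generator is an iterator, so values are fetched with next()
res = next(gen)
print(res)
"""
Output:
1   (printed by print(1) inside the generator function)
11  (the value produced by yield, printed by print(res))
"""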
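In practice a generator is more often consumed with a for loop than with repeated next() calls, because the loop stops cleanly when the generator runs out of values. A minimal sketch, using a simplified generator without the print statements (my_gen is a hypothetical name):

```python
def my_gen():
    yield 11
    yield 22
    yield 33


gen = my_gen()
first = next(gen)  # take the first value by hand

remaining = []
for value in gen:  # the loop picks up where next() left off
    remaining.append(value)

print(first)      # 11
print(remaining)  # [22, 33]
```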

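Using send

Like next, send retrieves the next value from a generator; unlike next, it is a method of the generator object. send is also more powerful: next can only fetch a value, while send fetches a value and also passes one into the generator.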

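Example

Define the generator function

def myGen():
    print('process start')
    # res receives the value passed in by send(), not the value after yield
    res = yield 100

    print(res, '内部打印1')  # "internal print 1"
    print('process start')
    res = yield 200

    print(res, '内部打印2')  # "internal print 2"
    print('process start')
    res = yield 300

    print(res, '内部打印3')  # "internal print 3"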

Create the generator

gen = myGen()

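First call

# On the first call, send() must be passed None. The generator has not yet
# reached a yield, so nothing is waiting to receive a value; passing
# anything else raises TypeError.
res = gen.send(None)

print(res)
"""
Output:
process start
100
"""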

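The first call with send runs these statements:

print('process start')
res = yield 100

Execution reaches a yield for the first time at yield 100. Before that point no yield was waiting to receive a value, which is why the first send must pass None. This is a runtime rule rather than syntax: a non-None value raises TypeError.

Note that res in res = yield 100 has not been assigned yet. Only half of the statement has run: yield 100 has produced its value, but the assignment to res is still pending.

The first call returns 100, the value produced by res = yield 100.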
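The rule about the first send can be checked directly. In this sketch, echo is a hypothetical generator function; sending a non-None value before the generator has started raises TypeError, and the generator can still be started normally afterwards.

```python
def echo():
    received = yield 'ready'
    yield received


gen = echo()
try:
    gen.send('too early')  # the generator has not reached a yield yet
except TypeError as exc:
    error_message = str(exc)

print(error_message)

first = gen.send(None)     # equivalent to next(gen): runs to the first yield
print(first)               # ready
```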

Second call

res = next(gen)
print(res)
"""
Output:
None 内部打印1
process start
200
"""

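The second call runs these statements:

res = yield 100
print(res, '内部打印1')
print('process start')
res = yield 200

The generator resumes where it last paused, at res = yield 100. Only the second half of that statement runs now, because yield 100 already ran during the first call; what remains is the assignment to res. The second call uses next, which cannot pass a value in, so res is assigned None and the print shows None.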
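Put another way, next(gen) behaves like gen.send(None): in both cases the paused yield expression evaluates to None. A small sketch, with show_received as a hypothetical generator function:

```python
def show_received():
    received = yield 'first'
    yield f'got {received!r}'


gen_a = show_received()
next(gen_a)                       # run to the first yield
via_next = next(gen_a)            # resume with next()

gen_b = show_received()
gen_b.send(None)                  # run to the first yield
via_send_none = gen_b.send(None)  # resume with send(None)

print(via_next)       # got None
print(via_send_none)  # got None
```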

Third call

res = gen.send('第三次调用')  # the string means "third call"
print(res)
"""
Output:
第三次调用 内部打印2
process start
300
"""

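The third call runs these statements:

res = yield 200
print(res, '内部打印2')
print('process start')
res = yield 300

This works like the second call, except that send is used, so the value passed in is assigned to res when res = yield 200 completes.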
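Passing values in with send is what makes a generator useful as a small stateful worker. As one illustration (not part of the original example), the hypothetical running_average generator below receives numbers through send and hands back the average so far.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # hand back the average, then wait for input
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)  # advance to the first yield so send() has somewhere to deliver
results = [avg.send(10), avg.send(20), avg.send(30)]
print(results)  # [10.0, 15.0, 20.0]
```

The initial next(avg) is the same "first call must send None" step described for the example above.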

Fourth call

res = gen.send(None)
print(res)  # not reached: send() raises StopIteration first

"""
Output:
None 内部打印3
StopIteration  (raised as an exception)
"""

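The fourth call runs these statements:

res = yield 300
print(res, '内部打印3')

On the fourth call there is no further yield to reach. The function body ends without producing a value, so the generator raises StopIteration.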
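When a generator is driven by hand, the end of the sequence has to be handled explicitly. Two common options are sketched below with a hypothetical three_values generator: catching StopIteration, or giving next() a default value to return instead of raising.

```python
def three_values():
    yield 100
    yield 200
    yield 300


gen = three_values()
values = [next(gen), next(gen), next(gen)]

# Option 1: catch the exception explicitly.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

# Option 2: give next() a default to return instead of raising.
fallback = next(gen, 'done')

print(values, exhausted, fallback)  # [100, 200, 300] True done
```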

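Optimizing iteration

We have now covered containers, iterators and generators, and we know the difference between an iterable and an iterator. When you need to supply a sequence of values to loop over, prefer an iterator to an ordinary iterable such as a container.

In everyday code you will sometimes loop over a container that holds a very large number of values. The container then takes up a lot of memory, and building and holding it can slow the program down. In that case, iterate with an iterator or a generator instead: it produces one value at a time and only needs memory for the current value, which is why iterators are preferred here.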
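The memory difference is easy to observe. The rough sketch below compares a list with a generator over the same values. Note that sys.getsizeof reports only the size of the container object itself, not of the integers it refers to, so the real gap is larger than the numbers suggest.

```python
import sys

n = 1_000_000
as_list = [i * 2 for i in range(n)]  # all values stored at once
as_gen = (i * 2 for i in range(n))   # values produced one at a time

list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print(list_size > gen_size)          # True

total = sum(as_gen)                  # consumes the generator lazily
print(total == sum(as_list))         # True
```

The trade-off is that the generator can be iterated only once, while the list can be reused and indexed.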

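Summary

We have now covered the main kinds of functions in Python, both built-in and user-defined. Later we will look at commonly used standard-library and third-party modules, which provide many more functions that we use often.

  • Ordinary functions, defined with def
  • Anonymous functions, defined with lambda
  • Closures: an inner function uses variables of the outer function, and the outer function returns the inner function. The returned inner function, together with the variables it captured, is the closure. In practice we rarely set out to write a closure for its own sake; we use closures as a technique
  • Higher-order functions, which take functions as arguments. Commonly used ones include the built-ins map, filter and sorted, and functools.reduce
  • Recursive functions, which call themselves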