Data Types: Python Tutorial

2020-01-10 15:49:10

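In most languages, a data type's format is defined to set upper and lower bounds on the data so that a program can use it efficiently. In Python, however, a variable can hold a value of any type without a type declaration. This feature is called dynamic typing.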

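The Python interpreter determines a variable's type at run time from the syntax of the value. For example, quotes (' ') denote a string, square brackets ([ ]) denote a list, curly braces ({ }) denote a dictionary, a number without a decimal point is an integer (int), and a number with a decimal point is a floating-point number (float).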
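As a minimal sketch of the rule above, the snippet below asks type() for the type that each literal form produces and then rebinds one name to values of two different types. The variable names are chosen only for illustration.

```python
# Each literal form produces an object of a different built-in type.
samples = ['text', [1, 2], {'a': 1}, 42, 4.2]
type_names = [type(value).__name__ for value in samples]
print(type_names)  # ['str', 'list', 'dict', 'int', 'float']

# Dynamic typing: the same name can be rebound to objects of different types.
value = 10
first_type = type(value)
value = "ten"
second_type = type(value)
print(first_type, second_type)  # <class 'int'> <class 'str'>
```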

Every variable, function, and module in Python is an object. In other words, everything is an object.

The main data types in Python are:

Booleans
Numbers
Strings
Bytes
Lists
Tuples
Sets
Dictionaries

Booleans

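Almost every programming language has Booleans. A Boolean has two possible values, True and False. Both are constants, used mainly in assignments and comparisons. For example: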

condition = False
if condition:
    print("You can continue with the program.")
else:
    print("The program will end here.")

Which branch runs depends on this statement:

if condition:

It can be read as:

if condition == True:

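In fact, Python expressions can also produce Boolean results.

For example, evaluating a conditional expression yields a Boolean value: Python evaluates the expression in its context and creates the Boolean result.

Because Python has many data structures, each one follows its own rules for determining its result in a Boolean context.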

>>> text = "Learn Python"  # named text to avoid shadowing the built-in str
>>> len(text)
12
>>> len(text) == 12
True
>>> len(text) != 12
False
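The per-type rules for Boolean contexts can be seen with the built-in bool(). The following is a small sketch, not an exhaustive list: empty containers, zero, and None count as false, while most other objects count as true.

```python
# Empty containers, zero, and None are falsy; most other objects are truthy.
values = [0, 1, "", "text", [], [0], {}, None]
truthiness = [bool(v) for v in values]
print(truthiness)
# [False, True, False, True, False, True, False, False]

items = []
if not items:  # idiomatic emptiness check, relies on the list's truth value
    print("The list is empty.")
```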

In some situations, the Boolean constants True and False can also be used as numbers.

>>> A, B = True + 0, False + 0
>>> print(A, B)
1 0
>>> type(A), type(B)
(<class 'int'>, <class 'int'>)

As shown, True acts as 1 and False acts as 0, so both can take part in arithmetic as numbers.

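Numbers

Numbers are the most frequently used data type in programs. Besides integers and floating-point numbers, Python also provides complex as a numeric type.

A few more points to know:

Python represents numbers with three built-in types: int, float, and complex.
Use the built-in type() function to find the data type of a variable or value.
Use the built-in isinstance() function to test an object's type.
Append j or J to a number to denote the imaginary part of a complex value.

For example: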

num = 2
print("The number (", num, ") is of type", type(num))
num = 3.0
print("The number (", num, ") is of type", type(num))
num = 3+5j
print("The number", num, "is of type", type(num))
print("The number", num, "is complex number?", isinstance(num, complex))

Output:

The number ( 2 ) is of type <class 'int'>
The number ( 3.0 ) is of type <class 'float'>
The number (3+5j) is of type <class 'complex'>
The number (3+5j) is complex number? True

A complex number can also be built with the constructor:

>>> complex(1.2, 5)
(1.2+5j)

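As long as memory is available, Python integers have no size limit. The platform-dependent sys.maxsize is not a cap on int. It reports the largest value a platform-sized index (Py_ssize_t), such as a list length, can hold: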

>>> import sys
>>> sys.maxsize
9223372036854775807

>>> num = 1234567890123456789
>>> num.bit_length()
61
>>> num
1234567890123456789
>>> num = 1234567890123456789123456789012345678912345678901234567891234567890123456789
>>> num.bit_length()
250
>>> num
1234567890123456789123456789012345678912345678901234567891234567890123456789

The float type can faithfully represent up to 15 significant decimal digits.

>>> import sys
>>> sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
>>> sys.float_info.dig
15

dig is the maximum number of decimal digits that can be faithfully represented in a float.
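A practical consequence of this limited precision is that decimal fractions such as 0.1 are stored approximately. The short sketch below shows why floats are usually compared with a tolerance (here via math.isclose with its default settings) rather than with ==.

```python
import math

total = 0.1 + 0.2
print(total == 0.3)              # False: both sides are binary approximations
print(math.isclose(total, 0.3))  # True: equal within a relative tolerance
```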

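Strings

A sequence of characters enclosed in single quotes ' or double quotes " is treated as a string. Any letter, digit, or symbol can be part of a string.

Python also supports multiline strings, which begin and end with three quotation marks.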

>>> text = 'A string wrapped in single quotes'
>>> text
'A string wrapped in single quotes'
>>> text = "A string enclosed within double quotes"
>>> text
'A string enclosed within double quotes'
>>> text = """A multiline string
starts and ends with
a triple quotation mark."""
>>> text
'A multiline string\nstarts and ends with\na triple quotation mark.'

Strings are immutable, so a string object is stored once and can be reused by several names.

>>> A = 'Python3'
>>> id(A)
4460827632
>>> B = A
>>> id(B)
4460827632

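As shown, the second variable shares the same address as the first: the assignment B = A binds a second name to the same string object.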
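Immutability itself can be checked directly. In this sketch (variable names are illustrative), assigning to one character raises TypeError, and building a changed string produces a new object while the original stays untouched.

```python
text = 'Python3'
alias = text  # a second name for the same object

try:
    text[0] = 'J'  # strings do not support item assignment
except TypeError as error:
    print("TypeError:", error)

# A "modified" string is really a new object built from the old one.
changed = 'J' + text[1:]
print(changed)          # Jython3
print(alias)            # Python3
print(changed is text)  # False
```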

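Python has two popular versions, 2.7 and 3.4. In Python 2, the default str type holds ASCII byte strings, although Unicode is available through the separate unicode type (the u prefix). In Python 3, every str is a Unicode string, and UTF-8 is the default source encoding.

Python 2 string types: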

>>> print(type('Python String'))
<type 'str'>
>>> print(type(u'Python Unicode String'))
<type 'unicode'>

Python 3 string types:

>>> print(type('Python String'))
<class 'str'>
>>> print(type(u'Python Unicode String'))
<class 'str'>

To extract part of a string, use the square-bracket slicing syntax:

>>> text = "Learn Python"  # named text to avoid shadowing the built-in str
>>> first_5_chars = text[:5]
>>> print(first_5_chars)
Learn
>>> substr_from_2_to_5 = text[1:5]
>>> print(substr_from_2_to_5)
earn
>>> substr_from_6_to_end = text[6:]
>>> print(substr_from_6_to_end)
Python
>>> last_2_chars = text[-2:]
>>> print(last_2_chars)
on
>>> first_2_chars = text[:2]
>>> print(first_2_chars)
Le
>>> two_chars_before_last = text[-3:-1]
>>> print(two_chars_before_last)
ho

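Bytes

bytes is an immutable type that stores a sequence of 8-bit values, each in the range 0 to 255. As with an array, you can read a single byte by index, but you cannot modify it.

Differences between bytes and strings:

A bytes object stores a sequence of bytes, whereas a string object stores a sequence of characters.
Bytes are machine-readable, whereas strings are human-readable.
Because bytes are machine-readable, they can be written to disk directly. Strings are human-readable and must be encoded before they are stored on disk.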
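The encoding step mentioned in the last point can be sketched with str.encode() and bytes.decode(). UTF-8 is chosen here only as an example encoding. Note that one character may occupy more than one byte.

```python
text = "héllo"
data = text.encode("utf-8")  # str -> bytes
print(data)                  # b'h\xc3\xa9llo'
print(len(text), len(data))  # 5 6  (é takes two bytes in UTF-8)
print(data[0])               # 104, indexing a bytes object yields an int

restored = data.decode("utf-8")  # bytes -> str
print(restored == text)          # True
```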

>>> empty_object = bytes(16)
>>> print(type(empty_object))
<class 'bytes'>
>>> print(empty_object)
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

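Bytes are often used in buffered I/O. For example, a program that continuously receives data over a network waits for the message header and terminator to appear in the stream before it starts parsing, and keeps appending the incoming bytes to a buffer in the meantime.

In Python-style pseudocode, this looks like: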

buf = b''
# message_not_complete() and read_from_socket() are placeholders, not real functions
while message_not_complete(buf):
    buf += read_from_socket()
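A runnable approximation of the loop above is sketched below. Everything specific in it is an assumption made for illustration: the terminator b"\r\n\r\n", the helper name read_message, the tiny chunk size, and the use of io.BytesIO as a stand-in for a real socket.

```python
import io

TERMINATOR = b"\r\n\r\n"  # assumed end-of-message marker


def read_message(stream, chunk_size=4):
    """Read chunks until TERMINATOR appears; return the bytes before it."""
    buf = bytearray()  # mutable buffer, cheaper to extend than bytes
    while TERMINATOR not in buf:
        chunk = stream.read(chunk_size)
        if not chunk:
            raise EOFError("stream ended before the terminator arrived")
        buf += chunk
    message, _, _rest = bytes(buf).partition(TERMINATOR)
    return message


# io.BytesIO plays the role of the network connection here.
fake_socket = io.BytesIO(b"HEADER: demo\r\n\r\nleftover")
message = read_message(fake_socket)
print(message)  # b'HEADER: demo'
```

A bytearray is used for the buffer because bytes objects are immutable, so each += on bytes would build a new object.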

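Lists

A list is similar to an array: it stores objects of any type as an ordered sequence. Lists are very flexible, have no fixed size, and are indexed from 0.

A list is a heterogeneous collection of items of various data types. For example, a single list object can hold the files in a folder as well as a company's employee data.

Syntax

Create a list by placing the elements inside square brackets, separated by commas.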

>>> assorted_list = [True, False, 1, 1.1, 1+2j, 'Learn', b'Python']
>>> first_element = assorted_list[0]
>>> print(first_element)
True
>>> print(assorted_list)
[True, False, 1, 1.1, (1+2j), 'Learn', b'Python']
>>> for item in assorted_list:
...     print(type(item))
...
<class 'bool'>
<class 'bool'>
<class 'int'>
<class 'float'>
<class 'complex'>
<class 'str'>
<class 'bytes'>

List objects are mutable: Python lets you modify a list and its elements by assignment or through the list's built-in methods.

>>> simpleton = ['Learn', 'Python', '2']
>>> id(simpleton)
56321160
>>> simpleton
['Learn', 'Python', '2']
>>> simpleton[2] = '3'
>>> id(simpleton)
56321160
>>> simpleton
['Learn', 'Python', '3']
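The example above changes an element by assignment. The built-in methods work in place as well, as this short sketch shows (the list contents are arbitrary):

```python
langs = ['Learn', 'Python']

langs.append('3')            # add to the end
langs.insert(0, 'Now')       # insert at a given index
langs.remove('Learn')        # remove the first matching value
last = langs.pop()           # remove and return the last element
langs.extend(['is', 'fun'])  # append every item of another iterable

print(last)   # 3
print(langs)  # ['Now', 'Python', 'is', 'fun']
```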

Nesting

A list can contain another list. Such a list is called a nested list.

>>> nested = [[1,1,1], [2,2,2], [3,3,3]]
>>> for items in nested:
...     for item in items:
...         print(item, end=' ')
...
1 1 1 2 2 2 3 3 3

Slicing

Lists support slicing just like the strings described earlier. With the slice operator [ ], you can extract one or more elements from a list.

>>> languages = ['C', 'C++', 'Python', 'Java', 'Go', 'Angular']
>>> print('languages[:3] = ', languages[:3])
languages[:3] =  ['C', 'C++', 'Python']
>>> print('languages[2:] = ', languages[2:])
languages[2:] =  ['Python', 'Java', 'Go', 'Angular']

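Tuples

A tuple is a comma-separated heterogeneous collection that can store any data type. Tuples resemble lists in the following ways:

Both are ordered sequences.
Both support indexing and repetition.
Both can be nested.
Both can store any data type.

Syntax

Create a tuple by placing the elements inside parentheses, separated by commas.

Defining a tuple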

pure_tuple = ()
print(pure_tuple)
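A few details of the syntax are easy to miss, so here is a small sketch with illustrative names: a one-element tuple needs a trailing comma, the parentheses are optional when packing values, and tuple() converts any iterable.

```python
empty = ()
single = ('Python',)          # the trailing comma makes this a tuple
not_a_tuple = ('Python')      # without it, this is just a string
packed = 1, 2.0, 'three'      # parentheses are optional when packing
from_list = tuple([1, 2, 3])  # build a tuple from any iterable

print(type(single).__name__)       # tuple
print(type(not_a_tuple).__name__)  # str
print(packed)                      # (1, 2.0, 'three')
print(from_list)                   # (1, 2, 3)
```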

Nesting

first_tuple = (3, 5, 7, 9)
second_tuple = ('learn', 'python 3')
nested_tuple = (first_tuple, second_tuple)
print(nested_tuple)

Output:

((3, 5, 7, 9), ('learn', 'python 3'))

Repetition

sample_tuple = ('Python 3',)*3
print(sample_tuple)

Output:

('Python 3', 'Python 3', 'Python 3')

Slicing

sample_tuple = (0, 1, 2, 3, 4)

tuple_without_first_item = sample_tuple[1:]
print(tuple_without_first_item)

tuple_reverse = sample_tuple[::-1]
print(tuple_reverse)

tuple_from_3_to_5 = sample_tuple[2:4]
print(tuple_from_3_to_5)

Output:

(1, 2, 3, 4)
(4, 3, 2, 1, 0)
(2, 3)

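In sample_tuple[2:4], the 2 means the slice starts at the third element, and the 4 means it stops at the fifth element, which is excluded.

Differences between tuples and lists

The biggest difference is that lists are mutable and tuples are immutable. Python does not allow a tuple to be modified after it is created, so no element can be added or removed. To update an element, you must create a new tuple.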
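The sketch below illustrates both halves of that statement with an arbitrary tuple named point: item assignment fails with TypeError, and an "update" is done by building a new tuple from slices of the old one.

```python
point = (1, 2, 3)

try:
    point[0] = 10  # tuples do not support item assignment
except TypeError as error:
    print("TypeError:", error)

# To "update" an element, build a new tuple instead.
updated = (10,) + point[1:]
print(updated)  # (10, 2, 3)
print(point)    # (1, 2, 3), the original is unchanged
```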

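Mutable objects inside a tuple can be modified

The elements of a tuple cannot be reassigned, but if an element is itself a mutable object, that object can still be modified.

For example:

>>> sample_tuple = (0, [1, 2, 3], 2, 3, 4)
>>> sample_tuple[1][0] = 666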
>>> print(sample_tuple)
(0, [666, 2, 3], 2, 3, 4)

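Because a list is a mutable object, it can be modified in place even when it sits inside a tuple.

Purpose

Reasons Python supports tuples:

Functions use tuples to return multiple values.
Tuples are more lightweight than lists.
Tuples can hold any number of elements.
Tuples can serve as dictionary keys.
Tuples protect their elements from modification.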
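Three of these points can be sketched briefly. The function min_max and the grid dictionary are made up for illustration, and the size comparison reflects CPython's implementation rather than a language guarantee.

```python
import sys


def min_max(values):
    """Return the smallest and largest value as one tuple."""
    return min(values), max(values)


low, high = min_max([4, 9, 1, 7])  # tuple unpacking
print(low, high)  # 1 9

# A tuple of hashable items can be used as a dictionary key.
grid = {(0, 0): 'origin', (1, 2): 'treasure'}
print(grid[(1, 2)])  # treasure

# In CPython a tuple typically needs less memory than the equivalent list.
print(sys.getsizeof((1, 2, 3)), sys.getsizeof([1, 2, 3]))
```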

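Sets

Sets support mathematical operations such as union, intersection, and symmetric difference. A set is an unordered collection of unique, immutable (hashable) objects, written inside curly braces with the elements separated by commas.

The type derives from the mathematical notion of a set, so the same element cannot appear more than once.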
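The operations named above map onto operators, as in this short sketch with two arbitrary sets. The results are passed through sorted() only so that the printed order is predictable.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b
intersection = a & b
difference = a - b
symmetric_difference = a ^ b

print(sorted(union))                 # [1, 2, 3, 4, 5]
print(sorted(intersection))          # [3, 4]
print(sorted(difference))            # [1, 2]
print(sorted(symmetric_difference))  # [1, 2, 5]

# Duplicates collapse automatically.
unique = set([1, 1, 2, 2, 3])
print(sorted(unique))  # [1, 2, 3]
```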

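Purpose

Sets have an advantage over lists for membership tests. A set is implemented with a hash table, so checking whether the container holds a particular element takes constant time on average, whereas a list must be scanned element by element.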
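A rough way to observe the difference is to time the same membership test on a list and on a set. This is only a sketch: the sizes and repeat count are arbitrary, and the measured times depend on the machine, so no specific numbers are shown.

```python
import timeit

numbers_list = list(range(100_000))
numbers_set = set(numbers_list)
target = 99_999  # last element: the worst case for a list scan

found_in_list = target in numbers_list  # linear search
found_in_set = target in numbers_set    # hash lookup

list_time = timeit.timeit(lambda: target in numbers_list, number=100)
set_time = timeit.timeit(lambda: target in numbers_set, number=100)
print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```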

Usage

Use the built-in set() function to create a set from an iterable.

>>> sample_set = set("Python data types")
>>> type(sample_set)
<class 'set'>
>>> sample_set
{'e', 'y', 't', 'o', ' ', 'd', 's', 'P', 'p', 'n', 'h', 'a'}

Another simple way is to enclose the elements in curly braces { }.

>>> another_set = {'red', 'green', 'black'}
>>> type(another_set)
<class 'set'>
>>> another_set
{'red', 'green', 'black'}

Frozen Set

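A frozen set is an immutable form of the ordinary set. Its data cannot change, and it supports only the methods and operators that produce a result without altering the set itself.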

>>> frozenset()
frozenset()
>>> cities = {"New York City", "Saint Petersburg", "London", "Munich", "Paris"}
>>> fset = frozenset(cities)
>>> type(fset)
<class 'frozenset'>

A complete example shows the difference between a frozen set and a normal set:

sample_set = {"red", "green"}
sample_set.add("black")
print("Standard Set")
print(sample_set)
 
frozen_set = frozenset(["red", "green", "black"])
print("Frozen Set")
print(frozen_set)

Output:

Standard Set
{'green', 'red', 'black'}
Frozen Set
frozenset({'green', 'red', 'black'})
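To make the contrast more explicit, the following sketch (names are illustrative) shows that a frozenset has no add() method, that non-mutating operators still work and return a new frozenset, and that a frozenset is hashable and can therefore be a dictionary key.

```python
colors = frozenset(["red", "green", "black"])

try:
    colors.add("blue")  # frozenset has no mutating methods
except AttributeError as error:
    print("AttributeError:", error)

# Non-mutating operators return a new frozenset.
more = colors | {"blue"}
print(type(more).__name__, len(more))  # frozenset 4

# Being hashable, a frozenset can be used as a dictionary key.
palette = {colors: "basic"}
print(palette[frozenset(["black", "green", "red"])])  # basic
```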

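Dictionaries

A dictionary is a collection of key-value pairs and is Python's built-in mapping type, in which keys map to values. Since Python 3.7 a dictionary preserves insertion order, while earlier versions did not guarantee any order. Key-value pairs provide an intuitive way to store data.

Purpose

Dictionaries store large data sets efficiently, and Python has highly optimized them for fast data retrieval.

Usage

Create a dictionary with curly braces { }. Each element is a key-value pair. Values can be of any data type, while keys must be hashable (for example strings, numbers, or tuples of hashable items).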

>>> sample_dict = {'key':'value', 'jan':31, 'feb':28, 'mar':31}
>>> type(sample_dict)
<class 'dict'>
>>> sample_dict
{'key': 'value', 'jan': 31, 'feb': 28, 'mar': 31}

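Accessing elements

Dictionaries have built-in methods for accessing their contents:

keys(): returns a view of the dictionary's keys.
values(): returns a view of the values that the keys map to.
items(): returns a view of all entries as (key, value) pairs.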

>>> sample_dict = {'key':'value', 'jan':31, 'feb':28, 'mar':31}
>>> sample_dict.keys()
dict_keys(['key', 'jan', 'feb', 'mar'])
>>> sample_dict.values()
dict_values(['value', 31, 28, 31])
>>> sample_dict.items()
dict_items([('key', 'value'), ('jan', 31), ('feb', 28), ('mar', 31)])
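Besides these three methods, individual values are usually read by key. The sketch below reuses sample_dict and shows square-bracket lookup, get() with a default for a missing key, a membership test, and iteration over items().

```python
sample_dict = {'key': 'value', 'jan': 31, 'feb': 28, 'mar': 31}

jan_days = sample_dict['jan']         # raises KeyError if the key is missing
apr_days = sample_dict.get('apr', 0)  # returns the default instead
has_feb = 'feb' in sample_dict        # membership tests look at keys

print(jan_days, apr_days, has_feb)    # 31 0 True

for name, value in sample_dict.items():
    print(name, '->', value)
```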

Modifying a dictionary (add/update/delete)

Because dictionary objects are mutable, you can add, update, and delete entries.

>>> sample_dict['feb'] = 29
>>> sample_dict
{'key': 'value', 'jan': 31, 'feb': 29, 'mar': 31}
>>> sample_dict.update({'apr':30})
>>> sample_dict
{'key': 'value', 'jan': 31, 'feb': 29, 'mar': 31, 'apr': 30}
>>> del sample_dict['key']
>>> sample_dict
{'jan': 31, 'feb': 29, 'mar': 31, 'apr': 30}