0911 - Python - Distributed Learning

2022-05-05

I. About setup.py in Python

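II. Common os.path methods in Python, explained

The os.path module is mainly used to manipulate path names and to query file attributes, and it comes up constantly in everyday code. The most commonly used functions are listed below. For the rest, see the official documentation: http://docs.python.org/library/os.path.html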

The examples below were run on Windows after import os. Results that depend on the file system or the current directory will differ on other machines.

1. os.path.abspath(path)
Returns the normalized absolute version of path.
>>> os.path.abspath('test.csv')
'C:\\Python25\\test.csv'
>>> os.path.abspath('c:\\test.csv')
'c:\\test.csv'
>>> os.path.abspath('../csv\\test.csv')
'C:\\csv\\test.csv'

2. os.path.split(path)
Splits path into a (directory, file name) pair.
>>> os.path.split('c:\\csv\\test.csv')
('c:\\csv', 'test.csv')
>>> os.path.split('c:\\csv\\')
('c:\\csv', '')

3. os.path.dirname(path)
Returns the directory part of path, that is, the first element of os.path.split(path).
>>> os.path.dirname('c:\\csv\\test.csv')
'c:\\csv'
>>> os.path.dirname('c:\\csv')
'c:\\'

4. os.path.basename(path)
Returns the final file name in path, that is, the second element of os.path.split(path). If path ends with / or \, an empty string is returned.
>>> os.path.basename('c:\\test.csv')
'test.csv'
>>> os.path.basename('c:\\csv')
'csv'
(here csv is treated as a file name)
>>> os.path.basename('c:\\csv\\')
''

5. os.path.commonprefix(list)
Returns the longest prefix shared by all paths in list. The comparison is done character by character, so the result is not guaranteed to be a valid path.
>>> os.path.commonprefix(['/home/td','/home/td/ff','/home/td/fff'])
'/home/td'

6. os.path.exists(path)
Returns True if path exists, False otherwise.
>>> os.path.exists('c:\\')
True
>>> os.path.exists('c:\\csv\\test.csv')
False

7. os.path.isabs(path)
Returns True if path is an absolute path.

8. os.path.isfile(path)
Returns True if path is an existing file, False otherwise.
>>> os.path.isfile('c:\\boot.ini')
True
>>> os.path.isfile('c:\\csv\\test.csv')
False
>>> os.path.isfile('c:\\csv\\')
False

9. os.path.isdir(path)
Returns True if path is an existing directory, False otherwise.
>>> os.path.isdir('c:\\')
True
>>> os.path.isdir('c:\\csv\\')
False
>>> os.path.isdir('c:\\windows\\test.csv')
False

10. os.path.join(path1[, path2[, ...]])
Joins several path components and returns the result. Components that come before the last absolute path are discarded.
>>> os.path.join('c:\\', 'csv', 'test.csv')
'c:\\csv\\test.csv'
>>> os.path.join('windows\\temp', 'c:\\', 'csv', 'test.csv')
'c:\\csv\\test.csv'
>>> os.path.join('/home/aa','/home/aa/bb','/home/aa/bb/c')
'/home/aa/bb/c'

11. os.path.normcase(path)
On Linux and Mac the function returns path unchanged. On Windows it converts all characters in the path to lowercase and all forward slashes to backslashes.
>>> os.path.normcase('c:/windows\\system32\\')
'c:\\windows\\system32\\'

12. os.path.normpath(path)
Normalizes the path by collapsing redundant separators and resolving . and .. references.
>>> os.path.normpath('c://windows\\System32\\../Temp/')
'c:\\windows\\Temp'

13. os.path.splitdrive(path)
Returns a (drive, rest of path) tuple.
>>> os.path.splitdrive('c:\\windows')
('c:', '\\windows')

14. os.path.splitext(path)
Separates the file name from its extension and returns a (root, extension) tuple, which can be indexed or sliced like any other tuple.
>>> os.path.splitext('c:\\csv\\test.csv')
('c:\\csv\\test', '.csv')

15. os.path.getsize(path)
Returns the size of the file at path, in bytes.
>>> os.path.getsize('c:\\boot.ini')
299

16. os.path.getatime(path)
Returns the time of last access of the file or directory at path, as seconds since the epoch.

17. os.path.getmtime(path)
Returns the time of last modification of the file or directory at path, as seconds since the epoch.
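Because the results above depend on the operating system, it can help to experiment with the pure string functions in a platform-independent way. The sketch below uses the standard ntpath and posixpath modules (the Windows and POSIX implementations behind os.path) and reuses the sample paths from the list. None of these calls touches the real file system.

```python
import ntpath      # Windows flavour of os.path, importable on every platform
import posixpath   # POSIX flavour of os.path

path = 'c:\\csv\\test.csv'

# split() returns (dirname, basename); the two helpers return each half.
head, tail = ntpath.split(path)
assert (head, tail) == (ntpath.dirname(path), ntpath.basename(path))
print(head, tail)

# splitext() and splitdrive() cut the path at other points.
root, ext = ntpath.splitext(path)
drive, rest = ntpath.splitdrive(path)
print(root, ext, drive, rest)

# join() discards everything before the last absolute component.
joined = ntpath.join('windows\\temp', 'c:\\', 'csv', 'test.csv')
print(joined)

# commonprefix() compares character by character, so the result may not be
# a real directory; commonpath() compares whole path components instead.
paths = ['/usr/lib', '/usr/local']
char_prefix = posixpath.commonprefix(paths)
component_prefix = posixpath.commonpath(paths)
print(char_prefix, component_prefix)
```

When a common directory is what you actually need, commonpath() is usually the safer choice of the two.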

III. Using sys.path.insert()
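As a minimal sketch of what this heading refers to: sys.path is the list of directories Python searches when importing a module, and sys.path.insert(0, directory) places a directory at the front of that list so it is searched first. The temporary directory and the module name demo_helper are made up for this example.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory containing a tiny module (hypothetical name).
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'demo_helper.py'), 'w') as f:
    f.write('def greet():\n    return "hello from demo_helper"\n')

# Index 0 means "search this directory before any other entry".
sys.path.insert(0, module_dir)
importlib.invalidate_caches()   # make sure the freshly written file is noticed

import demo_helper
message = demo_helper.greet()
print(message)

# Remove the entry again so later imports are not affected.
sys.path.remove(module_dir)
```

Inserting at index 0 also means the directory can shadow standard modules of the same name, so sys.path.append() is the gentler option when priority does not matter.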

IV. map and reduce

>>> from functools import reduce
>>> def fn(x, y):
...     return x * 10 + y
...
>>> def char2num(s):
...     return {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9}[s]
...
>>> reduce(fn, map(char2num, '13579'))
13579
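The session above converts the string '13579' to an integer without calling int(): map turns each character into a digit, and reduce folds the digits from left to right. The sketch below wraps the same idea in one function. The names str2int and DIGITS are chosen here for illustration.

```python
from functools import reduce

DIGITS = {str(d): d for d in range(10)}   # '0' -> 0, ..., '9' -> 9

def str2int(s):
    """Convert a string of decimal digits to an int using map and reduce."""
    digits = map(DIGITS.__getitem__, s)              # '13579' -> 1, 3, 5, 7, 9
    return reduce(lambda acc, d: acc * 10 + d, digits)

# reduce evaluates f(f(f(f(1, 3), 5), 7), 9) step by step:
# 1*10+3 = 13, 13*10+5 = 135, 135*10+7 = 1357, 1357*10+9 = 13579
result = str2int('13579')
print(result)
```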

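V. lambda

While learning Python, the lambda syntax often causes confusion: what is lambda, why use it, and is it ever required?

The questions above are answered below.

1. What is lambda?

Consider an example:

    g = lambda x: x + 1

Calling it gives:

>>> g(1)
2
>>> g(2)
3

You can also call a lambda directly, as long as it is wrapped in parentheses:

>>> (lambda x: x + 1)(1)
2

In other words, lambda is an expression that defines an anonymous function. In the example above, x is the parameter and x + 1 is the function body. Written as an ordinary function it is:

    def g(x):
        return x + 1

This is easy to follow: lambda simply shortens the function definition. The code becomes more compact, although a regular def is more explicit and easier to read.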

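Python also provides a few ready-made functions that pair well with lambda: filter, map and reduce. In Python 3, filter and map return lazy iterators (hence the list() calls below), and reduce has to be imported from functools.

>>> from functools import reduce
>>> foo = [2, 18, 9, 22, 17, 24, 8, 12, 27]
>>> print(list(filter(lambda x: x % 3 == 0, foo)))
[18, 9, 24, 12, 27]
>>> print(list(map(lambda x: x * 2 + 10, foo)))
[14, 46, 28, 54, 44, 58, 26, 34, 64]
>>> print(reduce(lambda x, y: x + y, foo))
139

What map does in the example above is simple and clear. But does Python really need lambda to be this concise? For processing the elements of a collection, the for..in..if syntax (a list comprehension) is already very powerful, and it is more readable than lambda.

For instance, the map example above can be written as:

    print([x * 2 + 10 for x in foo])

which is concise and easy to understand.

The filter example can be written as:

    print([x for x in foo if x % 3 == 0])

which is likewise easier to understand than the lambda version.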
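As a quick check, the sketch below confirms that the comprehension forms produce the same lists as the lambda forms, and shows that one comprehension can filter and transform in a single pass. It assumes Python 3.

```python
foo = [2, 18, 9, 22, 17, 24, 8, 12, 27]

# lambda versions (filter and map are lazy in Python 3, so wrap them in list())
by_filter = list(filter(lambda x: x % 3 == 0, foo))
by_map = list(map(lambda x: x * 2 + 10, foo))

# comprehension versions
by_filter_comp = [x for x in foo if x % 3 == 0]
by_map_comp = [x * 2 + 10 for x in foo]

# A single comprehension can also filter and transform in one pass.
combined = [x * 2 + 10 for x in foo if x % 3 == 0]

print(by_filter == by_filter_comp, by_map == by_map_comp)
print(combined)
```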

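That was a brief look at what lambda is. Next, why use it? Consider an example (from apihelper.py):

    processFunc = collapse and (lambda s: " ".join(s.split())) or (lambda s: s)

In Visual Basic, you would most likely write a function that takes a string argument and a collapse argument and uses an if statement to decide whether to collapse whitespace before returning the appropriate value. That approach is inefficient because the function has to handle every possible case: each time you call it, it must decide whether to collapse whitespace before giving you what you want. In Python, you can take the decision logic out of the function and define a tailored lambda that does exactly (and only) what you want. This is more efficient and more elegant, and it is less prone to those annoying (oh, all those arguments make my head spin) errors.

As this example shows, lambda can shorten code considerably. Note, however, that it also reduces readability to some extent: someone who is not very familiar with Python may find such code hard to follow.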
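The and/or idiom in the apihelper.py line predates Python's conditional expression. As a sketch of the same "decide once, then reuse the chosen function" pattern in current syntax (make_process_func is a hypothetical helper name, not part of apihelper.py):

```python
def make_process_func(collapse):
    """Pick the string-processing function once, up front."""
    # Same choice as: collapse and (lambda s: ...) or (lambda s: s)
    return (lambda s: " ".join(s.split())) if collapse else (lambda s: s)

text = "too   many \n  spaces"

collapse_ws = make_process_func(True)
keep_as_is = make_process_func(False)

# The returned functions no longer test the collapse flag on each call.
collapsed = collapse_ws(text)
unchanged = keep_as_is(text)
print(repr(collapsed))
print(repr(unchanged))
```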

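lambda defines an anonymous function.

lambda does not make a program run faster; it only makes the code more concise.

If something can be done with for...in...if, do not use lambda.

If you do use lambda, keep loops out of it. When a loop is needed, define a regular function instead, which gives the code reusability and better readability.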

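VI. strip

The Python strip() method removes the specified characters from both ends of a string (by default, whitespace such as spaces, tabs and newlines).

Syntax

Syntax of the strip() method:

str.strip([chars])

Parameters

chars -- the set of characters to remove from the beginning and end of the string.

Return value

Returns a new string with the specified characters removed from both ends.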
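One detail is easy to miss: chars is treated as a set of individual characters, not as a prefix or suffix string. The sketch below illustrates this, along with the one-sided variants lstrip() and rstrip(). The sample strings are made up.

```python
raw = "  \t hello world \n"
default_stripped = raw.strip()        # no argument: strips whitespace
print(repr(default_stripped))

url = "www.example.com"
# chars is a set: every leading/trailing 'w', '.', 'c', 'o' or 'm' is removed
# until a character outside the set is reached.
stripped = url.strip("w.com")
print(stripped)

padded = "000123000"
left_only = padded.lstrip("0")        # leading characters only
right_only = padded.rstrip("0")       # trailing characters only
both = padded.strip("0")
print(left_only, right_only, both)
```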

Example

The following example shows how to use strip():

    #!/usr/bin/python
    s = "0000000this is string example....wow!!!0000000"
    print(s.strip('0'))

The example above produces the following output:

this is string example....wow!!!

VII. yield

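To understand how yield works, you first need to understand what generators are. And before generators come iterables.

Iterables

When you create a list, you can read its items one by one. Reading its items one by one is called iteration:

>>> mylist = [1, 2, 3]
>>> for i in mylist:
...     print(i)
1
2
3

mylist is an iterable. When you use a list comprehension, you create a list, and therefore an iterable:

>>> mylist = [x*x for x in range(3)]
>>> for i in mylist:
...     print(i)
0
1
4

Anything you can loop over with for ... in ... is an iterable: lists, strings, files and so on. Iterables are handy because you can read them as often as you like, but all their values have to be stored in memory, and much of the time there is no need to keep every value around.

Generators

Generators are iterables too, but you can iterate over them only once, because they do not keep all their values in memory. They produce the values on the fly:

>>> mygenerator = (x*x for x in range(3))
>>> for i in mygenerator:
...     print(i)
0
1
4

This is the same as the list comprehension except that [] became (). But you cannot loop over mygenerator a second time, because a generator can be used only once: it computes 0 and forgets it, then computes 1, and finally 4, one value at a time.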
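Both properties, the small memory footprint and the single pass, can be observed directly. The following is a small sketch; the size comparison relies on sys.getsizeof and reflects CPython behaviour, so treat it as illustrative rather than guaranteed.

```python
import sys

squares_list = [x * x for x in range(1000)]
squares_gen = (x * x for x in range(1000))

# The list holds every value; the generator only holds its current state.
list_is_bigger = sys.getsizeof(squares_list) > sys.getsizeof(squares_gen)
print(list_is_bigger)

first_pass = sum(squares_gen)    # consumes the generator
second_pass = sum(squares_gen)   # nothing left: sum of an empty sequence
print(first_pass, second_pass)

# A list, by contrast, can be traversed as often as needed.
assert sum(squares_list) == sum(squares_list) == first_pass
```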

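Yield

yield is used like return, except that the function returns a generator. For example:

>>> def createGenerator():
...     mylist = range(3)
...     for i in mylist:
...         yield i*i
...
>>> mygenerator = createGenerator()  # create a generator
>>> print(mygenerator)  # mygenerator is an object!
<generator object createGenerator at 0xb7555c34>
>>> for i in mygenerator:
...     print(i)
0
1
4

The example above is very basic, but it illustrates how yield is used: calling createGenerator() gives you a generator.

To master yield, you must understand one key point: when you call the function, the code you wrote in its body does not run. The function only returns the generator object. This is a bit tricky :-)

Your code then runs each time the for loop uses the generator.

Now the hard part:

The first time the for loop asks the generator object created by your function for a value, the function body runs from the beginning until it hits yield, and the first value is returned. Each subsequent request resumes the function right after the yield, runs the loop one more step and returns the next value, until there is nothing left to return. The generator is considered empty once the function runs without hitting yield again, either because the loop has come to an end or because an "if/else" condition is no longer satisfied.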
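The suspend-and-resume behaviour described above can be made visible by recording what the function body does. This is a small sketch: count_up_to and the log list are names invented for the illustration.

```python
log = []

def count_up_to(limit):
    log.append("body started")
    for n in range(limit):
        log.append(f"about to yield {n}")
        yield n
    log.append("body finished")

gen = count_up_to(2)
assert log == []                 # calling the function ran none of its body

first = next(gen)                # runs from the top up to the first yield
assert log == ["body started", "about to yield 0"]

second = next(gen)               # resumes after the yield, runs to the next one
try:
    next(gen)                    # the loop ends without reaching yield again
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)
print(log)
```

A for loop performs the same next() calls and handles the final StopIteration silently.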

Your code explained

Example. The generator (a method defined inside the node class):

    # Here you create the method of the node object that will return the generator
    def _get_child_candidates(self, distance, min_dist, max_dist):

        # Here is the code that will be called each time you use the generator object:

        # If there is still a child of the node object on its left
        # AND if distance is ok, return the next child
        if self._leftchild and distance - max_dist < self._median:
            yield self._leftchild

        # If there is still a child of the node object on its right
        # AND if distance is ok, return the next child
        if self._rightchild and distance + max_dist >= self._median:
            yield self._rightchild

        # If the function arrives here, the generator will be considered empty
        # there is no more than two values: the left and the right children

The caller (an excerpt from the body of another method):

    # Create an empty list and a list with the current object reference
    result, candidates = list(), [self]

    # Loop on candidates (they contain only one element at the beginning)
    while candidates:

        # Get the last candidate and remove it from the list
        node = candidates.pop()

        # Get the distance between obj and the candidate
        distance = node._get_dist(obj)

        # If distance is ok, then you can fill the result
        if distance <= max_dist and distance >= min_dist:
            result.extend(node._values)

        # Add the children of the candidate in the candidates list
        # so the loop will keep running until it will have looked
        # at all the children of the children of the children, etc. of the candidate
        candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))

    return result
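The excerpt depends on a node class that is not shown. To try the same pattern in isolation, here is a simplified, self-contained sketch. The Node class, its children() method and collect_values() are hypothetical stand-ins, and the distance filtering of the original is deliberately left out.

```python
class Node:
    """Hypothetical stand-in for the node class the excerpt assumes."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def children(self):
        # Yields at most two values; after that the generator is empty.
        if self.left:
            yield self.left
        if self.right:
            yield self.right

def collect_values(root):
    result, candidates = [], [root]
    while candidates:                        # the list grows while we loop
        node = candidates.pop()              # take the most recently added node
        result.append(node.value)
        candidates.extend(node.children())   # drains one fresh generator per node
    return result

tree = Node(1, Node(2, Node(4)), Node(3))
values = collect_values(tree)
print(values)
```

Because pop() takes the last element, the traversal is depth-first and visits the right child before the left one.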

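This code contains several clever parts:

1. The loop iterates over a list, but the list grows while the loop is running :-) This is a concise way to walk through nested data, though it is a bit risky because you can end up with an infinite loop. In this case, candidates.extend(node._get_child_candidates(distance, min_dist, max_dist)) exhausts all the values of the generator, but the while loop keeps creating new generator objects, which produce different values from the previous ones because they are applied to different nodes.

2. The extend() method is a list method that expects an iterable and appends its values to the list.

Usually we pass it a list:

>>> a = [1, 2]
>>> b = [3, 4]
>>> a.extend(b)
>>> print(a)
[1, 2, 3, 4]

But in this code it receives a generator, which is good because:

First, you do not need to read the values twice.

Second, you do not need to hold all the values in memory at once.

It works because Python does not care whether the argument of a method is a list. It expects an iterable, so it works with strings, lists, tuples and generators. This is known as dynamic typing, or more specifically duck typing, and it is one of the reasons Python is so cool. But duck typing is a topic for another day.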
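A short sketch of that point: list.extend() accepts any iterable, so the same call works for a list, a tuple, a string and a generator. The squares() generator is made up for the example.

```python
def squares(n):
    for i in range(n):
        yield i * i

result = []
result.extend([1, 2])          # list
result.extend((3, 4))          # tuple
result.extend("ab")            # string: one item per character
result.extend(squares(3))      # generator: consumed value by value
print(result)
```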

Now let's look at some advanced uses:

Controlling generator exhaustion:

>>> class Bank():  # let's create a bank, building ATMs
...     crisis = False
...     def create_atm(self):
...         while not self.crisis:
...             yield "$100"
...
>>> hsbc = Bank()  # when everything's ok the ATM gives you as much as you want
>>> corner_street_atm = hsbc.create_atm()
>>> print(next(corner_street_atm))
$100
>>> print(next(corner_street_atm))
$100
>>> print([next(corner_street_atm) for cash in range(5)])
['$100', '$100', '$100', '$100', '$100']
>>> hsbc.crisis = True  # crisis is coming, no more money!
>>> print(next(corner_street_atm))
Traceback (most recent call last):
  ...
StopIteration
>>> wall_street_atm = hsbc.create_atm()  # it's even true for new ATMs
>>> print(next(wall_street_atm))
Traceback (most recent call last):
  ...
StopIteration
>>> hsbc.crisis = False  # trouble is, even post-crisis the ATM remains empty
>>> print(next(corner_street_atm))
Traceback (most recent call last):
  ...
StopIteration
>>> brand_new_atm = hsbc.create_atm()  # build a new one to get back in business
>>> for cash in brand_new_atm:
...     print(cash)
$100
$100
$100
$100
$100
$100
$100
$100
$100
...

This can be very useful, for example for controlling access to a resource.

The itertools module

The itertools module contains special functions for manipulating iterables. Ever wanted to duplicate a generator, or chain two generators together? And so on.

Then just import itertools.

Here is an example: the possible orders of arrival in a four-horse race.

>>> import itertools
>>> horses = [1, 2, 3, 4]
>>> races = itertools.permutations(horses)
>>> print(races)
<itertools.permutations object at 0xb754f1dc>
>>> print(list(itertools.permutations(horses)))
[(1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 2, 4), (1, 3, 4, 2), (1, 4, 2, 3), (1, 4, 3, 2), (2, 1, 3, 4), (2, 1, 4, 3), (2, 3, 1, 4), (2, 3, 4, 1), (2, 4, 1, 3), (2, 4, 3, 1), (3, 1, 2, 4), (3, 1, 4, 2), (3, 2, 1, 4), (3, 2, 4, 1), (3, 4, 1, 2), (3, 4, 2, 1), (4, 1, 2, 3), (4, 1, 3, 2), (4, 2, 1, 3), (4, 2, 3, 1), (4, 3, 1, 2), (4, 3, 2, 1)]
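The two operations mentioned earlier, duplicating a generator and chaining generators, correspond to itertools.tee and itertools.chain. A minimal sketch, with squares() as a made-up generator:

```python
import itertools

def squares(n):
    for i in range(n):
        yield i * i

# tee() splits one iterator into independent copies; the original iterator
# should not be used afterwards.
copy_a, copy_b = itertools.tee(squares(4))
list_a = list(copy_a)
list_b = list(copy_b)

# chain() walks several iterables one after another.
chained = list(itertools.chain(squares(3), "ab"))

# Number of possible finishing orders for four horses: 4! = 24.
orders = sum(1 for _ in itertools.permutations([1, 2, 3, 4]))

print(list_a, list_b)
print(chained)
print(orders)
```

Note that tee() buffers values one copy has consumed and the other has not, so it saves little memory when one copy is drained completely before the other starts, as here.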

Finally, understanding the inner mechanics of iteration:

Iteration is a process implying iterables (implementing the __iter__() method) and iterators (implementing the __next__() method). Iterables are any objects you can get an iterator from. Iterators are objects that let you iterate on iterables.
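To make the distinction concrete, here is a sketch of a hand-written iterable and its iterator. Countdown and CountdownIterator are hypothetical classes written only for this illustration.

```python
class Countdown:
    """Iterable: each call to __iter__ hands out a fresh iterator."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)

class CountdownIterator:
    """Iterator: keeps the position and produces values via __next__."""
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self              # iterators are themselves iterable

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # tells the for loop to stop
        self.current -= 1
        return self.current + 1

countdown = Countdown(3)
first_loop = [n for n in countdown]
second_loop = list(countdown)    # works again: a new iterator is created

# What a for loop does behind the scenes:
it = iter(countdown)
manual = [next(it), next(it), next(it)]
print(first_loop, second_loop, manual)
```

A generator function bundles both roles: the object it returns implements __iter__ and __next__ for you, which is why it can be consumed only once.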

Reposted from: https://www.cnblogs.com/mesakiiyui/p/7504926.html
