时间:2020-07-20 python爬虫 查看: 1101
1.re.match()
re.match()的概念是从头匹配一个符合规则的字符串,从起始位置开始匹配,匹配成功返回一个对象,未匹配成功返回None。
包含的参数如下:
pattern: 正则模型
string : 要匹配的字符串
falgs : 匹配模式
match() 方法一旦匹配成功,就是一个match object对象,而match object对象有以下方法:
group() 返回被 RE 匹配的字符串
start() 返回匹配开始的位置
end() 返回匹配结束的位置
span()返回一个元组包含匹配 (开始,结束) 的位置
案例:
import re
# re.match 返回一个Match Object 对象
# 对象提供了 group() 方法,来获取匹配的结果
result = re.match("hello","hello,world")
if result:
print(result.group())
else:
print("匹配失败!")
输出结果:
hello
2.re.search()
re.search()函数会在字符串内查找模式匹配,只要找到第一个匹配然后返回,如果字符串没有匹配,则返回None。
格式:re.search(pattern, string, flags=0)
要求:匹配出文章阅读的次数
import re
ret = re.search(r"\d+", "阅读次数为 9999")
print(ret.group())
输出结果:
9999
3.match()和search()的区别:
match()函数只检测RE是不是在string的开始位置匹配,
search()会扫描整个string查找匹配
match()只有在0位置匹配成功的话才有返回,如果不是开始位置匹配成功的话,match()就返回none
举例说明:
import re
print(re.match('super', 'superstition').span())
(0, 5)
print(re.match('super','insuperable'))
None
print(re.search('super','superstition').span())
(0, 5)
print(re.search('super','insuperable').span())
(2, 7)
补充知识: jupyter notebook_主函数文件如何调用类文件
使用jupyter notebook编写python程序,rw_visual.jpynb是写的主函数,random_walk.jpynb是类(如图)。在主函数中将类实例化后运行会报错,经网络查找解决了问题,缺少Ipynb_importer.py这样一个链接文件。
解决方法:
1、在同一路径下创建名为Ipynb_importer.py的文件:File-->download as-->Python(.py),该文件内容如下:
#!/usr/bin/env python
# coding: utf-8
# In[ ]:
import io, os,sys,types
from IPython import get_ipython
from nbformat import read
from IPython.core.interactiveshell import InteractiveShell
class NotebookFinder(object):
"""Module finder that locates Jupyter Notebooks"""
def __init__(self):
self.loaders = {}
def find_module(self, fullname, path=None):
nb_path = find_notebook(fullname, path)
if not nb_path:
return
key = path
if path:
# lists aren't hashable
key = os.path.sep.join(path)
if key not in self.loaders:
self.loaders[key] = NotebookLoader(path)
return self.loaders[key]
def find_notebook(fullname, path=None):
"""find a notebook, given its fully qualified name and an optional path
This turns "foo.bar" into "foo/bar.ipynb"
and tries turning "Foo_Bar" into "Foo Bar" if Foo_Bar
does not exist.
"""
name = fullname.rsplit('.', 1)[-1]
if not path:
path = ['']
for d in path:
nb_path = os.path.join(d, name + ".ipynb")
if os.path.isfile(nb_path):
return nb_path
# let import Notebook_Name find "Notebook Name.ipynb"
nb_path = nb_path.replace("_", " ")
if os.path.isfile(nb_path):
return nb_path
class NotebookLoader(object):
"""Module Loader for Jupyter Notebooks"""
def __init__(self, path=None):
self.shell = InteractiveShell.instance()
self.path = path
def load_module(self, fullname):
"""import a notebook as a module"""
path = find_notebook(fullname, self.path)
print ("importing Jupyter notebook from %s" % path)
# load the notebook object
with io.open(path, 'r', encoding='utf-8') as f:
nb = read(f, 4)
# create the module and add it to sys.modules
# if name in sys.modules:
# return sys.modules[name]
mod = types.ModuleType(fullname)
mod.__file__ = path
mod.__loader__ = self
mod.__dict__['get_ipython'] = get_ipython
sys.modules[fullname] = mod
# extra work to ensure that magics that would affect the user_ns
# actually affect the notebook module's ns
save_user_ns = self.shell.user_ns
self.shell.user_ns = mod.__dict__
try:
for cell in nb.cells:
if cell.cell_type == 'code':
# transform the input to executable Python
code = self.shell.input_transformer_manager.transform_cell(cell.source)
# run the code in themodule
exec(code, mod.__dict__)
finally:
self.shell.user_ns = save_user_ns
return mod
sys.meta_path.append(NotebookFinder())
2、在主函数中import Ipynb_importer
import matplotlib.pyplot as plt
import Ipynb_importer
from random_walk import RandomWalk
rw = RandomWalk()
rw.fill_walk()
plt.scatter(rw.x_values, rw.y_values, s=15)
plt.show()
3、运行主函数,调用成功
ps:random_walk.jpynb文件内容如下:
from random import choice
class RandomWalk():
def __init__(self, num_points=5000):
self.num_points = num_points
self.x_values = [0]
self.y_values = [0]
def fill_walk(self):
while len(self.x_values) < self.num_points:
x_direction = choice([1,-1])
x_distance = choice([0,1,2,3,4])
x_step = x_direction * x_distance
y_direction = choice([1,-1])
y_distance = choice([0,1,2,3,4])
y_step = y_direction * y_distance
if x_step == 0 and y_step == 0:
continue
next_x = self.x_values[-1] + x_step
next_y = self.y_values[-1] + y_step
self.x_values.append(next_x)
self.y_values.append(next_y)
运行结果:
以上这篇浅谈Python中re.match()和re.search()的使用及区别就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持python博客。