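import re

import requests
from bs4 import BeautifulSoup

# Fetch the page source.
x = requests.get('http://www.jxufe.edu.cn/')
txt = x.text
# Name the parser explicitly; otherwise BeautifulSoup warns and picks whichever parser is installed.
soup = BeautifulSoup(txt, 'html.parser')
# Extract the visible text from the parsed page.
txt_text = soup.get_text()
# Collapse each run of consecutive newlines into a single newline.
y = re.sub(r'\n+', '\n', txt_text)
print(y)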
Use requests to fetch the page source, extract the text with BeautifulSoup, then use a regular expression to remove the extra blank lines.
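
The blank-line cleanup can be tried on its own, without a network request. The sketch below is only an illustration: the function names `collapse_newlines` and `collapse_blank_lines` and the sample string are hypothetical, not part of the original script. It shows that the pattern `\n+` leaves lines containing only spaces or tabs in place, and one possible variant that removes those as well.

```python
import re


def collapse_newlines(text):
    # Same substitution as the script: each run of newlines becomes one newline.
    return re.sub(r'\n+', '\n', text)


def collapse_blank_lines(text):
    # Hypothetical variant: also drops lines that hold only whitespace,
    # because \s matches spaces, tabs, and newlines between two line breaks.
    return re.sub(r'\n\s*\n', '\n', text)


# Hypothetical sample standing in for the output of soup.get_text().
sample = "Title\n\n\nFirst\n   \nSecond\n"

print(repr(collapse_newlines(sample)))
print(repr(collapse_blank_lines(sample)))
```

Text extracted from real pages often contains whitespace-only lines left over from indented markup, so the second variant may give a more compact result; whether that is wanted depends on the page.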