Automatic Article Collection and Publishing (Python + Selenium Automatic Article Publishing Series, 2018-05-18)

Published by 优采云: 2022-04-11 11:46


Python + Selenium Automatic Article Publishing (2): Jianshu

2018-05-18


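The Python + Selenium automatic article publishing series:

Python + Selenium Automatic Article Publishing (1): OSChina

Python + Selenium Automatic Article Publishing (2): Jianshu

Python + Selenium Automatic Article Publishing (3): CSDN

Python + Selenium Automatic Article Publishing (4): Adding a bat Script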

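Before We Begin

This article shows how to publish Jianshu articles automatically with Python + Selenium. The necessary preparation was covered in the previous article and is not repeated here.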

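Usage Notes

As before, we first need a quick look at the blog-writing page (remember to set the default editor to Markdown).

Figure: Writing a blog post on Jianshu

As the screenshot shows, writing a post on Jianshu means selecting a category (that is, a collection), creating a new article, and then filling in the title and the content, in that order.

Mapping this onto auto.md: the title is read from the line that defines `title: `, and the body is whatever follows the `-->\n` marker. The category is read from the `self_category` entry in the comment header, following the same convention.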
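To make these parsing rules concrete, here is a minimal sketch. The real auto.md is not shown in this article, so the sample text below is an assumption: an HTML comment header whose second line is `title: ...` and which closes with `-->`. The helper name `parse_markdown` is likewise illustrative and not part of the original code.

```python
import re

# Hypothetical sample; the actual auto.md layout is assumed, not quoted.
SAMPLE = """<!--
title: Hello Jianshu
self_category: Diary
self_tags: python,selenium
-->
# Heading

Body text.
"""


def parse_markdown(text):
    """Split the text into header metadata and the Markdown body."""
    # Everything before the first '-->\n' is the header, the rest is the body.
    header, _, body = text.partition('-->\n')
    meta = {}
    for line in header.splitlines():
        match = re.match(r'(\w+): (.*)', line)
        if match:
            meta[match.group(1)] = match.group(2).strip()
    return meta, body


meta, body = parse_markdown(SAMPLE)
print(meta['title'])          # Hello Jianshu
print(meta['self_category'])  # Diary
```

Unlike `split('-->\n')[1]`, `partition` keeps the whole remainder even if the body itself happens to contain `-->` on a line of its own.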

Code Walkthrough

main.py: the program entry class. It parses the Markdown file with regular expressions and calls post to publish the article.

import re
import linecache

import jianshu


class Main(object):

    # init
    def __init__(self, file):
        self.title = ''
        self.content = ''
        self.category = ''
        self.tags = ''
        # OsChina system category; default value
        self.osChina_sys_category = '编程语言'
        # CSDN article category; default value
        self.csdn_article_category = '原创'
        # CSDN blog category; default value
        self.csdn_blog_category = '后端'
        self.read_file(file)

    # Read title, content, self_category, self_tags, osChina_sys_category,
    # csdn_article_category and csdn_blog_category from the Markdown file
    def read_file(self, markdown_file):
        # The title is expected on line 2 of the file
        self.title = linecache.getline(markdown_file, 2).split('title: ')[1].strip('\n')
        with open(markdown_file, 'r', encoding='UTF-8') as f:
            # The body is everything after the '-->\n' marker
            self.content = f.read().split('-->\n')[1]
            # Reset the file pointer to the beginning
            f.seek(0)
            for line in f.readlines():
                if re.search('self_category: ', line) is not None:
                    self.category = line.split('self_category: ')[1].strip('\n')
                elif re.search('self_tags: ', line) is not None:
                    self.tags = line.split('self_tags: ')[1].strip('\n')
                elif re.search('osChina_sys_category: ', line) is not None:
                    self.osChina_sys_category = line.split('osChina_sys_category: ')[1].strip('\n')
                elif re.search('csdn_article_category: ', line) is not None:
                    self.csdn_article_category = line.split('csdn_article_category: ')[1].strip('\n')
                elif re.search('csdn_blog_category: ', line) is not None:
                    self.csdn_blog_category = line.split('csdn_blog_category: ')[1].strip('\n')


if __name__ == '__main__':
    md_file = 'auto.md'
    print("Markdown File is ", md_file)
    timeout = 10
    main = Main(md_file)

    # Jianshu
    jian_shu = jianshu.JianShu()
    jian_shu.post(main, timeout)

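authorize.py: currently only the QQ authorized login method is implemented.

jianshu.py: the core class that writes (publishes) the Jianshu post automatically.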

import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait

import authorize


# Jianshu
class JianShu(object):

    @staticmethod
    def post(main, timeout, self_timeout=3):
        # 1. Open the login page
        login = 'https://www.jianshu.com/sign_in'
        driver = webdriver.Chrome()
        driver.get(login)

        # 2. Maximize the window
        driver.maximize_window()

        # 3. Log in through QQ authorization
        driver.find_element(By.XPATH, '/html/body/div[1]/div[2]/div/div/ul/li[3]/a/i').click()
        # Close the current (login) window before authorizing
        driver.close()
        authorize.qq(driver, timeout)

        # 4. Click "Write article"
        write_blog = WebDriverWait(driver, timeout).until(
            lambda d: d.find_element(By.XPATH, '/html/body/nav/div/a[2]'))
        write_blog.click()
        # Close the current window, then switch to the last window handle
        driver.close()
        window_handles = driver.window_handles
        driver.switch_to.window(window_handles[-1])

        # 5. Click the requested category
        classify = WebDriverWait(driver, timeout).until(
            lambda d: d.find_elements(By.CLASS_NAME, '_3DM7w'))
        for c in classify:
            html = c.get_attribute('innerHTML')
            if main.category in html:
                c.click()
            else:
                # TODO: if the category does not exist, it could be created here
                pass

        # 6. Click "New article"
        time.sleep(self_timeout)
        new_article = WebDriverWait(driver, timeout).until(
            lambda d: d.find_element(By.XPATH, '//*[@id="root"]/div/div[2]/div[1]/div/div/div/div[1]/i'))
        new_article.click()
        article = WebDriverWait(driver, timeout).until(
            lambda d: d.find_element(By.XPATH, '//*[@id="root"]/div/div[2]/div[1]/div/div/div/ul/li[1]'))
        article.click()

        # 7. Fill in the title and the content
        time.sleep(self_timeout)
        title = driver.find_element(By.CLASS_NAME, '_24i7u')
        title.clear()
        title.send_keys(main.title)
        content = driver.find_element(By.ID, 'arthur-editor')
        content.clear()
        content.send_keys(main.content)

        # 8. Save as draft
        driver.find_element(By.XPATH, '//*[@id="root"]/div/div[2]/div[2]/div/div/div/div/ul/li[8]/a').click()

        # 9. Publish the article (left commented out; this run only saves a draft)
        # driver.find_element(By.XPATH, '//*[@id="root"]/div/div[2]/div[2]/div/div/div/div/ul/li[1]/a').click()
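Steps 3 and 4 rely on the same pattern: after a click, the current window is closed and the driver switches to the last entry in `driver.window_handles`. Below is a minimal sketch of that pattern as a helper. The name `close_and_switch_to_newest` and the `FakeDriver` stand-in are hypothetical and exist only so the snippet runs without a browser; in real use a `webdriver.Chrome()` instance would be passed in. Like the code above, the sketch assumes the newest window is the last handle in the list, which is not strictly guaranteed.

```python
class FakeSwitchTo:
    """Mimics driver.switch_to, supporting only window()."""

    def __init__(self, driver):
        self._driver = driver

    def window(self, handle):
        self._driver.current_window_handle = handle


class FakeDriver:
    """Tiny stand-in exposing only the attributes the helper uses."""

    def __init__(self, handles):
        self.window_handles = list(handles)
        self.current_window_handle = self.window_handles[0]
        self.switch_to = FakeSwitchTo(self)

    def close(self):
        # Closing removes the current window from the handle list.
        self.window_handles.remove(self.current_window_handle)


def close_and_switch_to_newest(driver):
    """Close the current window, then focus the last remaining handle."""
    driver.close()
    newest = driver.window_handles[-1]
    driver.switch_to.window(newest)
    return newest


driver = FakeDriver(['login-window', 'qq-auth-window'])
print(close_and_switch_to_newest(driver))  # qq-auth-window
```

Wrapping the three calls in one function keeps the "close, then switch" order in a single place, which matters because a closed window can no longer receive commands.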

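Jianshu also supports logging in with an account and password. Unfortunately that flow includes a text CAPTCHA step, which looks hard to get past, and I have not yet worked out how to handle it, so QQ authorized login is used for now.

Results

Here is a screenshot of the script in action. This test run only saves a draft.

Figure: Jianshu after the automated run

Closing Thoughts

That is roughly the approach for writing Jianshu articles automatically, and it is not the only one. Feel free to adapt the code to your needs. The structure of the page may also change, so there is no guarantee the program will keep working. The next article covers how to write (publish) articles on CSDN automatically.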
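Because the page structure can change, one way to soften the breakage is to try several locators for the same element in order. The following is a rough sketch and not part of the original code: `find_first`, the `title-input` id, and `FakeDriver` are all hypothetical, and the fake driver exists only so the snippet runs without a browser. With Selenium, `driver.find_element(by, value)` raises `NoSuchElementException` when nothing matches, so that exception would be passed as `not_found`.

```python
def find_first(driver, locators, not_found=(LookupError,)):
    """Return the first element matched by any (by, value) pair, in order."""
    for by, value in locators:
        try:
            return driver.find_element(by, value)
        except not_found:
            continue  # this locator no longer matches; try the next one
    raise LookupError('no locator matched: %r' % (locators,))


class FakeDriver:
    """Stand-in that only knows the elements given to it, keyed by (by, value)."""

    def __init__(self, known):
        self._known = known

    def find_element(self, by, value):
        try:
            return self._known[(by, value)]
        except KeyError:
            raise LookupError(value)


# The class name comes from the code above; the fallback id is made up.
TITLE_LOCATORS = [('class name', '_24i7u'), ('id', 'title-input')]

driver = FakeDriver({('id', 'title-input'): '<title element>'})
print(find_first(driver, TITLE_LOCATORS))  # <title element>
```

Keeping the locator lists in one place also means that a markup change only requires updating data, not the posting logic.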
