https://github.com/lemonhu/finance-qa-spider

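Text data collection/crawling for financial Q&A platforms; data sources include the Shanghai Stock Exchange (SSE), the Shenzhen Stock Exchange (SZSE), p5w.net (全景网), and Sina Guba (新浪股吧).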
deprecated-repo finance python2 question-answering scrapy-crawler

Last synced: 7 months ago

Host: GitHub
URL: https://github.com/lemonhu/finance-qa-spider
Owner: lemonhu
Created: 2017-08-20T06:59:16.000Z (almost 9 years ago)
Default Branch: master
Last Pushed: 2017-08-20T07:16:50.000Z (almost 9 years ago)
Last Synced: 2025-06-29T21:02:57.222Z (11 months ago)
Topics: deprecated-repo, finance, python2, question-answering, scrapy-crawler
Language: Python
Homepage:
Size: 10.7 KB
Stars: 39
Watchers: 2
Forks: 9
Open Issues: 1
Metadata Files:
- Readme: README.md

README

# Building a Financial Q&A Text Database with the Scrapy Framework

---

## Development Language

Python

## Development Environment

Eclipse + PyDev

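## Data Sources

1. Q&A system on the Shanghai Stock Exchange's official platform

   http://sns.sseinfo.com/qa.do

2. Q&A system on the Shenzhen Stock Exchange's official platform

   http://irm.cninfo.com.cn/szse/index.html

3. p5w.net (全景网) investor relations interactive platform

   http://rs.p5w.net/index/company/showQuestionPage.shtml

4. Sina Guba (新浪股吧), Sina's stock discussion forum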

   http://guba.sina.com.cn/?s=channel&chi
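A Scrapy spider usually declares a `name`, its `allowed_domains`, and its `start_urls`. The sketch below shows one way those values could be derived from the four sources listed above. It is not taken from the repository: the dictionary keys and the `spider_settings` helper are hypothetical names chosen for illustration, and the Sina Guba URL is copied exactly as it appears above, where it looks truncated.

```python
from urllib.parse import urlparse

# Start URLs copied from the data source list; the keys are hypothetical labels.
# The Sina Guba URL appears truncated in the source and is kept as-is.
SOURCES = {
    "sse": "http://sns.sseinfo.com/qa.do",
    "szse": "http://irm.cninfo.com.cn/szse/index.html",
    "p5w": "http://rs.p5w.net/index/company/showQuestionPage.shtml",
    "sina_guba": "http://guba.sina.com.cn/?s=channel&chi",
}


def spider_settings(name):
    """Return the per-spider attributes Scrapy expects for one source."""
    url = SOURCES[name]
    return {
        "name": name,
        # Restrict crawling to the host of the start URL.
        "allowed_domains": [urlparse(url).hostname],
        "start_urls": [url],
    }


for key in SOURCES:
    print(spider_settings(key))
```

Keeping the sources in one mapping means each spider class could read its attributes from a single place; the page parsing itself depends on each site's markup and is not sketched here.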

## Database Table `shse_qa`

	CREATE TABLE IF NOT EXISTS `shse_qa` (
	    `current_time` TIMESTAMP NOT NULL,
	    `user_name` VARCHAR(100) NOT NULL,
	    `company_name` VARCHAR(100) NOT NULL,
	    `company_id` INT(20) NOT NULL,
	    `question_time` VARCHAR(100) NOT NULL,
	    `question_content` TEXT NOT NULL,
	    `answer_time` VARCHAR(100),
	    `answer_content` TEXT
	) ENGINE=InnoDB DEFAULT CHARSET=utf8;
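To illustrate how a scraped question/answer pair might be written into a table with these columns, here is a minimal sketch. It makes several assumptions that do not come from the repository: it uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server, the helper names `create_table` and `save_record` are hypothetical, and the sample record is made up. In a Scrapy project, similar logic would typically live in an item pipeline's `process_item` method.

```python
import sqlite3
from datetime import datetime

# Column names taken from the shse_qa table definition.
COLUMNS = (
    "current_time", "user_name", "company_name", "company_id",
    "question_time", "question_content", "answer_time", "answer_content",
)


def create_table(conn):
    """Create a SQLite table mirroring the shse_qa columns (stand-in for MySQL)."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS "shse_qa" (
            "current_time" TIMESTAMP NOT NULL,
            "user_name" VARCHAR(100) NOT NULL,
            "company_name" VARCHAR(100) NOT NULL,
            "company_id" INT(20) NOT NULL,
            "question_time" VARCHAR(100) NOT NULL,
            "question_content" TEXT NOT NULL,
            "answer_time" VARCHAR(100),
            "answer_content" TEXT
        )
    """)


def save_record(conn, record):
    """Insert one record; unanswered questions keep NULL answer fields."""
    row = dict(record)
    # Assumption: current_time holds the moment the row is stored.
    row.setdefault("current_time",
                   datetime.now().isoformat(sep=" ", timespec="seconds"))
    row.setdefault("answer_time", None)
    row.setdefault("answer_content", None)
    names = ", ".join('"%s"' % c for c in COLUMNS)
    marks = ", ".join("?" for _ in COLUMNS)
    # Parameterized values avoid quoting problems in free-text content.
    conn.execute('INSERT INTO "shse_qa" (%s) VALUES (%s)' % (names, marks),
                 [row[c] for c in COLUMNS])
    conn.commit()


conn = sqlite3.connect(":memory:")
create_table(conn)
save_record(conn, {  # made-up sample record
    "user_name": "investor01",
    "company_name": "Example Co.",
    "company_id": 600000,
    "question_time": "2017-08-18 10:00",
    "question_content": "When will the interim report be published?",
})
rows = conn.execute(
    'SELECT "company_id", "answer_content" FROM "shse_qa"').fetchall()
print(rows)
```

The answer columns are nullable in the schema, so a question that has not been answered yet can be stored and updated later.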
