Data Workflow Steps

########################### In Colab #####################
!pip install wrds

Tutorial 3: Example Python Data Workflow

import wrds
db = wrds.Connection()  # prompts for WRDS credentials if none are stored
db.close()
db = wrds.Connection(wrds_username='luojx')  # or pass the username directly
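The connection opened above is only closed at the very end of the workflow, so an exception in between would leave it open. One way to guard against that is `contextlib.closing` from the standard library, which only needs the object to have a `close()` method. The sketch below uses `FakeConnection`, a hypothetical stand-in that lets it run without WRDS credentials. With real access it would be replaced by `wrds.Connection(...)`.

```python
from contextlib import closing


class FakeConnection:
    """Hypothetical stand-in for wrds.Connection; it only tracks open/closed state."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False


conn = FakeConnection()
try:
    with closing(conn) as db:
        # Queries would go here; simulate one that fails midway.
        raise RuntimeError("simulated query failure")
except RuntimeError:
    pass

# closing() called db.close() even though the block raised.
print(conn.is_open)
```

The same pattern should apply to any object exposing `close()`, which is the only method of the real connection this sketch relies on.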

Data Workflow Steps

  1. Determine which libraries are available at WRDS

  2. List the tables in one library, e.g., crsp

  3. Describe the Daily Stock File (dsf) dataset to see its variables

  4. Preview the first 5 rows with get_table() or raw_sql()

#1
#db.list_libraries()     #Full list
#2
#db.list_tables('crsp')[:5]
#db.list_tables('crsp') #Full list
#3
db.describe_table('crsp', 'dsf') # Display available data variables (column names)
#4
db.get_table('crsp', 'dsf', obs=5)
db.raw_sql('select * from crsp.dsf LIMIT 5') #Alternative by SQL
  5. Work with selected variables, e.g., cusip, permno, date, bidlo, and askhi, using get_table() or raw_sql()

  6-9. Select data by criteria

#5
db.get_table('crsp', 'dsf', columns=['cusip', 'permno', 'date', 'bidlo', 'askhi'], obs=5)
db.raw_sql('select cusip, permno, date, bidlo, askhi from crsp.dsf LIMIT 5')
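The two calls in step 5 request the same columns, once as a Python list and once as a SQL string. A minimal sketch of how a column list maps onto that SQL text is shown below. `build_select` is a hypothetical helper written for this tutorial, not part of the wrds package, and it assumes the identifiers passed in are trusted (it does no quoting or validation).

```python
def build_select(library, table, columns=None, limit=None):
    """Build a simple SELECT statement as a string (hypothetical helper)."""
    # Each column is its own list element; no list means "all columns".
    cols = ", ".join(columns) if columns else "*"
    sql = f"select {cols} from {library}.{table}"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql


sql = build_select("crsp", "dsf", ["cusip", "permno", "date", "bidlo", "askhi"], limit=5)
print(sql)
```

The resulting string could then be passed to `db.raw_sql(sql)` on an open connection.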
#6 Rows for four permnos, dated 2010 through 2013, with askhi above 2000
db.raw_sql("select cusip,permno,date,bidlo,askhi from crsp.dsf where permno in (14593, 90319, 12490, 17778) and date between '2010-01-01' and '2013-12-31' and askhi > 2000")
#7 Distinct permnos with any askhi above 2000
db.raw_sql('select distinct permno from crsp.dsf where askhi > 2000')
#8 Distinct (date, permno) pairs with askhi above 2000, sorted by date
db.raw_sql("select distinct date,permno from crsp.dsf where askhi > 2000 order by date")
#9 The single row with the highest askhi
db.raw_sql('select permno,askhi,date from crsp.dsf where askhi > 2000 order by askhi desc LIMIT 1')
### End
db.close()
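The filters in steps 6 to 9 can be checked offline before running them against WRDS. The sketch below does this with the standard-library `sqlite3` module and an in-memory table named `crsp.dsf`. Every row is made up for illustration, SQLite is a swapped-in engine rather than the WRDS backend, and only the five columns used in this tutorial are modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database as "crsp" so the name crsp.dsf resolves.
conn.execute("ATTACH DATABASE ':memory:' AS crsp")
conn.execute(
    "CREATE TABLE crsp.dsf "
    "(cusip TEXT, permno INTEGER, date TEXT, bidlo REAL, askhi REAL)"
)

# Toy rows (all values invented), chosen so each filter excludes something.
rows = [
    ("AAAA0000", 14593, "2012-09-19", 2010.0, 2050.0),
    ("AAAA0000", 14593, "2014-01-02", 2100.0, 2150.0),  # outside 2010-2013
    ("BBBB0000", 17778, "2011-03-01", 2400.0, 2500.0),
    ("CCCC0000", 99999, "2012-05-05", 3000.0, 3100.0),  # permno not in the list
    ("DDDD0000", 90319, "2012-06-06", 500.0, 510.0),    # askhi not above 2000
]
conn.executemany("INSERT INTO crsp.dsf VALUES (?, ?, ?, ?, ?)", rows)

# Step 6: permno list, date range, and askhi threshold combined.
q6 = conn.execute(
    "select cusip,permno,date,bidlo,askhi from crsp.dsf "
    "where permno in (14593, 90319, 12490, 17778) "
    "and date between '2010-01-01' and '2013-12-31' and askhi > 2000"
).fetchall()

# Step 7: distinct permnos with any askhi above 2000.
q7 = conn.execute(
    "select distinct permno from crsp.dsf where askhi > 2000"
).fetchall()

# Step 8: distinct (date, permno) pairs, sorted by date.
q8 = conn.execute(
    "select distinct date,permno from crsp.dsf where askhi > 2000 order by date"
).fetchall()

# Step 9: the row with the highest askhi.
q9 = conn.execute(
    "select permno,askhi,date from crsp.dsf "
    "where askhi > 2000 order by askhi desc LIMIT 1"
).fetchall()

conn.close()

for name, result in [("q6", q6), ("q7", q7), ("q8", q8), ("q9", q9)]:
    print(name, result)
```

Dates are stored as ISO-formatted text here, so `between` compares them as strings. That gives the right ordering for this format but is an assumption of the toy setup, not a statement about how WRDS stores dates.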