Resources
Below are some datasets and software packages that might be useful for your term projects. If you find additional resources you think are interesting, let me know and I will add them to the list!
Data Sets
University of California Irvine Machine Learning Repository
228 datasets across a variety of domains. A great resource!
Amazon Web Services: Public Datasets
A repository of 54 public datasets, including census data, Wikipedia web traffic, genome data, and the Enron emails.
International Monetary Fund: Data and Statistics
Time series datasets covering macroeconomic and financial indicators, such as IMF lending and exchange rates.
Penn State Event Data Project
Political and conflict event data extracted from English-language news sources, focusing on events in the Middle East, Balkans, and West Africa.
World Bank: Data Catalog
Data on over 8,000 attributes (many of them macro-level economic, social, or governance indicators) from World Bank datasets across 200 countries.
University of California Berkeley: Statistical/Data Resources
Repository of datasets regarding public health issues, maintained by the Sheldon Margen Public Health Library.
University of Texas: Policy Agendas Project
Data on public policy development in the U.S., including Congressional bills, Presidential Executive Orders, public opinion polls, and federal budget reports.
Software Packages
Weka
Collection of data mining and machine learning algorithms in Java.
PyLog
First-order logic library for Python.
Jahmm
Hidden Markov Model implementation in Java.
GHMM
Hidden Markov Model implementation in C and Python.
ACL2s
Integrated modeling, simulation, and inductive reasoning system.