Prior DIL Workshops and Resources
DIL staff have held workshops on the following subjects, among others. Materials from these workshops are available upon request.
- Introduction to R: A workshop intended for those totally new to the R analysis system. Learn to use R for your own projects in the sciences and digital humanities.
- SPSS Review: Learn to use the IBM SPSS point-and-click interface for data import, descriptive statistics, and fundamental visualizations.
- SAS Review: Learn to code basic SAS procedures for data import, descriptive statistics, and fundamental visualizations.
- Easy Data Visualization with Plot.ly: Learn Plot.ly's easy 'plug and play' system for data visualization; the workshop will replicate graphics from The New York Times' "The Upshot", from data collection to visualization.
- Social Media Data Collection, Visualization, and Analysis: Learn easy R tools for sampling Twitter streams. Includes an overview of other social media tools.
- Text as Data: Sentiment Analysis: An introduction to lexicon (dictionary) based analyses of text affect and tone; applications will include social media posts, open-end survey responses, and literary texts.
- SPSS: Computing and Recoding Variables, Chart Editor: Learn to use the SPSS Chart Editor for data visualization and Transform capabilities for data management.
- SPSS: Simple and Multiple Linear Regression: Learn tools for estimating, interpreting, and diagnosing problems with simple and multiple linear regression.
- Text as Data: Stylometry: An introduction to the analysis of countable linguistic features of text documents, such as word and letter n-grams, to reveal differences in authorship and style.
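The lexicon-based approach described in the Text as Data: Sentiment Analysis workshop can be illustrated with a minimal sketch. The workshops themselves use R; the example below is written in Python only so that it runs without extra packages. The tiny word lists and the `score_text` function are hypothetical, made up for illustration, and are not the workshop's lexicon or method.

```python
import re

# Hypothetical toy lexicon; real analyses rely on published dictionaries.
POSITIVE = {"good", "great", "happy", "love"}
NEGATIVE = {"bad", "poor", "sad", "hate"}


def score_text(text):
    """Return the count of positive words minus the count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(1 for word in words if word in POSITIVE)
    negative_hits = sum(1 for word in words if word in NEGATIVE)
    return positive_hits - negative_hits


if __name__ == "__main__":
    # Made-up example responses, not workshop data.
    for response in ["I love this great class", "Bad room, poor sound"]:
        print(response, "->", score_text(response))
```

A score above zero suggests a positive tone and a score below zero a negative one. A simple count like this ignores negation ("not good") and word intensity.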
R Resources
Social Media Data Analysis with R
Please download the overview of the workshop.
And then download the R script file, socialmediadata.txt. Save it to a working directory with the file extension '.R', rather than '.txt'.
Open the script file in RStudio.
Introduction to R workshop materials
First, please see the handout with learning objectives and curated R resources. Also see an introductory R Guide, which explains installation of R in greater detail.
There are two R script files to download. The first is here, R intro workshop script file.txt. The second is here, World Bank data and Googlevis.txt. Download these files to a directory of your choice.
Second, please see this handout of comprehension check exercises.
Another dataset we may use: Boston housing characteristics. While saved on this webpage as a .txt file, you should rename it with the extension .csv while saving it to your directory.
Video guides to R
The Odum Institute at the University of North Carolina at Chapel Hill maintains a set of introductory videos on installing and using R for the first time.
Google developers produced a series of YouTube videos on R, ranging from introductory to advanced usage.
Online R Classes
Several MOOCs use R as software for learning statistics or data analysis. A few that are good for an introduction: 1) DataCamp's free course, Introduction to R; 2) Microsoft's Introduction to R Programming; and 3) Coursera's R Programming.
R-related blogs
R-bloggers aggregates current blogs on all things related to R. The site is a popular source for tutorials and examples of using R in almost every subject area imaginable. It maintains an exhaustive list of resources for learning R from scratch.
Books on R at GVSU
The GVSU library subscribes to SpringerLink, which publishes an extensive series of books on R in their "Use R!" series. Search for this series in the SpringerLink database.
Other R help resources
The Institute for Digital Research and Education at UCLA maintains an extensive set of notes and FAQs on using R. A set of introductory slides is useful for getting started.
SAS Resources
The UCLA Institute for Digital Research and Education hosts several introductory SAS resources.
Lex Jansen's SAS pages are the go-to source for SAS papers.
SPSS Resources
The GVSU Statistics Tutoring Center maintains several locally produced resources for learning SPSS, including Dr. John Gabrosek's SPSS Manual for Statistics 215, Introductory Applied Statistics.
The UCLA Institute for Digital Research and Education maintains an extensive set of notes on learning SPSS.
Three items for the DIL SPSS Refresher Workshops: SPSS Workshop Notes.docx, Workshop data file.xlsx, and Workshop results.sav.
Data Visualization
VISUALIZATION LINKS
For free, point-and-click data visualization, try the following:
Plotly: Create a wide range of static graphic visualizations, or interactive graphics for sharing online. Allows 'forking' and social editing of public visualizations. One private visualization is free; multiple private visualizations require a paid subscription. Check out the tutorials at https://help.plot.ly/
Datawrapper: Create univariate or time series visualizations from common types. Datawrapper is used by news media outlets, such as the UK's The Guardian; readers will notice a familiar look and feel. Unlimited static charts can be created for free, but exporting charts to anything other than PDF form requires a paid subscription.
Infogram: Upload your data, create, and share an infographic online.
Raw: Create web-based visualizations from a gallery of common chart types, all without uploading your data anywhere. Visualizations are created in your web browser without moving data off your local data storage. Visualizations can be exported as graphics files or shared online via HTML code.
Tableau: An extensive software application for data management, analysis, and visualization. Download the 'public' version of Tableau to your desktop computer.
WORKSHOP MATERIALS
Data:
- (CSV format) Irises.txt. Anderson's 1936(?) Irises data. Measurements in centimeters of Sepal Length, Sepal Width, Petal Length, and Petal Width, plus the Iris species name.
- (CSV format) developmentdata.txt. World Bank data on countries around the world in 2010, including GDP per capita (constant 2000 US$), fertility rate (total births per woman), carbon dioxide emissions (metric tons of CO2 per capita), and a few other variables.
- (CSV format) painters.txt. Data from the eighteenth-century art critic de Piles on 54 classical painters, each assessed on four characteristics: composition, drawing, color, and expression. Each measure is an integer ranging from 0 to 20. Source: Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
- (CSV format) gvsusocialsciencemajors.txt. GVSU Institutional Analysis counts of majors from 1999 to 2015 in Political Science and related fields.
- (CSV format) speedski.txt. Results from 2011 World Speed Skiing Championships, Verbier, Switzerland.
- (CSV format) bostonhousing.txt. Various characteristics of Boston, Massachusetts housing markets. Described here.
Please note: There is a file type conflict between GVSU and infogr.am. The data files are saved as .txt files (GVSU does not allow the .csv file extension), but infogr.am expects CSV files to have a .csv file extension, not .txt. So you'll need to rename these data files before uploading to infogr.am. The .txt file extension works fine on the other sites.
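Renaming the downloaded files one by one works fine; for a folder holding several of them, the step can also be scripted. The sketch below is in Python (in R, the base function `file.rename()` serves the same purpose). The function name `rename_txt_to_csv` and the folder name `workshop_data` are hypothetical placeholders. Note that it renames every .txt file in the folder, so it assumes the folder holds only the data files.

```python
from pathlib import Path


def rename_txt_to_csv(directory):
    """Rename every .txt file in directory to .csv and return the new paths."""
    renamed = []
    for path in sorted(Path(directory).glob("*.txt")):
        target = path.with_suffix(".csv")
        if target.exists():
            # Skip rather than overwrite a .csv file that is already there.
            continue
        path.rename(target)
        renamed.append(target)
    return renamed


if __name__ == "__main__":
    # "workshop_data" is a placeholder; point it at your own download folder.
    for new_path in rename_txt_to_csv("workshop_data"):
        print("renamed to", new_path)
```

Only the file extension changes; the contents were already comma-separated.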
Text Mining
From the Data and the Digital World Symposium workshop "Text to Data to Insight"
Make sure you bring along a laptop with a copy of R installed. And if you are on a Mac, you'll need the most recent copy of XQuartz installed as well.
Regardless of your operating system, just follow these instructions:
1. Download R at https://cran.r-project.org/ . (Don't worry if you've never used it before! We'll talk you through it.) Install it as you would any other piece of software. If you need help with installing R, please see the introductory R Guide. And if you encounter any trouble, contact Whitt Kilburn, [email protected].
2. If you are on a Mac, you also need to have the most recent version of XQuartz installed on your system. Doing so takes only a few minutes, but it is required before you attempt to use R for our purposes. Download it and install it at https://www.xquartz.org/ . If you are on a Windows machine, you can ignore step 2.
For the workshop, we will use a variety of text documents as sources of data. These documents should be downloaded to your laptop from this link: text to data to insight documents.zip. The .zip folder should contain 4 subfolders: a collection of British fiction, The Federalist papers, works of Shakespeare, and recent inaugural addresses of U.S. presidents.
An R script file will be available at the start of the workshop on Wednesday. An almost-final (penultimate) draft of the script file is here. Please save this version to your laptop as well. During the workshop, you'll occasionally need wireless access to download some additional material referenced in the script file. So please make sure you can connect to the GVSU wireless network.
Handouts: learning objectives, agenda, and a brief guide to using the stylo package. Also available: some PDF slides, plus library resources.
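Stylometric analysis generally starts from counts of features such as word and letter n-grams. The sketch below shows that counting step only. It is written in Python so it runs without extra packages, its function names are hypothetical, and it is not how the R package used in the workshop is implemented.

```python
from collections import Counter


def word_ngrams(text, n=2):
    """Count runs of n consecutive words (lowercased, split on whitespace)."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def char_ngrams(text, n=3):
    """Count runs of n consecutive characters, spaces included."""
    lowered = text.lower()
    return Counter(lowered[i:i + n] for i in range(len(lowered) - n + 1))


if __name__ == "__main__":
    sample = "to be or not to be"
    print(word_ngrams(sample, 2).most_common(3))
    print(char_ngrams(sample, 3).most_common(3))
```

Comparing such counts across documents, usually as relative frequencies, is what allows differences in authorship and style to show up.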