Data Visualization - Self-Paced Work

Follow these instructions to practice your data visualization skills! You are also welcome to explore further on your own and see what you can create. We recommend continuing in a Google Sheet or a Google Colab notebook. Pick one of the datasets below to get started!

Note: You can also check out the "Additional Explorations" at the bottom of the homepage.

Candy Power Rankings

This dataset compares different types of candy to each other.

Resources

Data Dictionary

  • chocolate: Does it contain chocolate?
  • fruity: Is it fruit flavored?
  • caramel: Is there caramel in the candy?
  • peanutalmondy: Does it contain peanuts, peanut butter or almonds?
  • nougat: Does it contain nougat?
  • crispedricewafer: Does it contain crisped rice, wafers, or a cookie component?
  • hard: Is it a hard candy?
  • bar: Is it a candy bar?
  • pluribus: Is it one of many candies in a bag or box?
  • sugarpercent: The sugar percentile of the candy compared to the rest of the dataset.
  • pricepercent: The unit price percentile compared to the rest of the set.
  • winpercent: The overall win percentage according to 269,000 matchups.
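
As a starting point, the sketch below compares the average winpercent of chocolate and non-chocolate candies and prints a rough text bar chart. The rows are made-up placeholder values, not the real dataset, and both the competitorname column and the 1/0 encoding of the yes/no columns are assumptions. It uses only the Python standard library, so it should run as-is in a Colab cell.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample rows for illustration only (NOT the real candy data).
# Assumes yes/no columns are stored as 1/0; "competitorname" is a
# hypothetical name column.
SAMPLE_CSV = """competitorname,chocolate,fruity,winpercent
Candy A,1,0,70.0
Candy B,1,0,60.0
Candy C,0,1,40.0
Candy D,0,1,50.0
"""

def mean_win_by_flag(csv_text, flag):
    """Return {flag value: mean winpercent} for a 1/0 column."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[int(row[flag])].append(float(row["winpercent"]))
    return {value: mean(wins) for value, wins in groups.items()}

candy_means = mean_win_by_flag(SAMPLE_CSV, "chocolate")
for value, avg in sorted(candy_means.items()):
    # Crude text bar chart: one '#' per 5 percentage points.
    print(f"chocolate={value}: {'#' * round(avg / 5)} {avg:.1f}")
```

Swapping "chocolate" for another yes/no column such as "fruity" gives the same comparison for that feature, and the resulting averages can be charted in a Google Sheet or any plotting tool you prefer.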

Sources

Video Game Systems

This dataset has average IGN Review scores by platform and genre.

Resources

Data Dictionary

  • Platform: The video game console the games were released on
  • Genre columns (Adventure, Puzzle, Racing, etc.): The average IGN review score for games of that genre on that platform

For example, on the Nintendo DS row, the Fighting column has a value of 6.32. This means that of all the fighting games for the Nintendo DS, the average IGN rating (out of 10) is 6.32!
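
Because this table has one column per genre, many charting tools work better once it is reshaped into one record per platform and genre. Here is a minimal sketch of that reshaping. Only the Nintendo DS / Fighting value (6.32) comes from the example above; every other number, the "Hypothetical Console" row, and the idea that a blank cell means "no games in that genre" are assumptions made for illustration.

```python
import csv
import io

# Only the Nintendo DS / Fighting value (6.32) comes from the text above;
# every other number and the second platform are made-up placeholders.
SAMPLE_CSV = """Platform,Fighting,Puzzle,Racing
Nintendo DS,6.32,7.00,6.50
Hypothetical Console,7.10,,8.00
"""

def wide_to_long(csv_text):
    """Turn one row per platform into (platform, genre, score) records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        platform = row.pop("Platform")
        for genre, score in row.items():
            if score:  # assumption: a blank cell means no score to report
                records.append((platform, genre, float(score)))
    return records

long_rows = wide_to_long(SAMPLE_CSV)

# Highest-scoring platform/genre pair in the sample.
best = max(long_rows, key=lambda record: record[2])
print(best)
```

The long format makes it easy to sort, filter by genre, or feed a grouped bar chart.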

Sources

NBA Game Data

This dataset contains records for NBA games from 1946-2015. Each record includes Elo ratings for the team and its opponent before and after the game, along with an Elo-based win forecast.

Resources

Data Dictionary

  • gameorder: Play order of game in NBA history
  • game_id: Unique ID for each game
  • lg_id: Which league the game was played in
  • _iscopy: Each row of data is tied to a single team for a single game, so every game appears twice (once per team). _iscopy flags whether this game_id has already appeared in the row for the opposing team in the same matchup
  • year_id: Season id, named based on year in which the season ended
  • date_game: Game date
  • is_playoffs: Flag for playoff games
  • team_id: Three letter code for team name, from Basketball Reference
  • fran_id: Franchise id. Multiple team_ids can fall under the same fran_id due to name changes or moves. Interactive is grouped by fran_id.
  • pts: Points scored by team
  • elo_i: Team Elo rating entering the game
  • elo_n: Team Elo rating following the game
  • win_equiv: Equivalent number of wins in an 82-game season for a team of elo_n quality
  • opp_id: Team id of opponent
  • opp_fran: Franchise id of opponent
  • opp_pts: Points scored by opponent
  • opp_elo_i: Opponent Elo rating entering the game
  • opp_elo_n: Opponent Elo rating following the game
  • game_location: Home (H), away (A), or neutral (N)
  • game_result: Win or loss for team in the team_id column
  • forecast: Elo-based chances of winning for the team in the team_id column, based on Elo ratings and game location
  • notes: Additional information
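
Two things are worth trying before you chart this data: dropping the duplicate row for each game, and seeing how an Elo-based forecast can be computed. The sketch below uses the standard Elo expected-score formula. The home_bonus of 100 Elo points is an assumption, so the result may not match the dataset's forecast column exactly, and the two rows are made-up examples with a 0/1 encoding of _iscopy assumed.

```python
def elo_forecast(elo_i, opp_elo_i, game_location, home_bonus=100.0):
    """Standard Elo expected score for the team in team_id.

    home_bonus is an ASSUMED home-court adjustment in Elo points;
    the dataset's own forecast column may use a different value.
    """
    shift = {"H": home_bonus, "A": -home_bonus, "N": 0.0}[game_location]
    diff = (elo_i + shift) - opp_elo_i
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

# Made-up rows: the same game appears twice, once per team.
games = [
    {"game_id": "G1", "_iscopy": 0, "team_id": "AAA",
     "elo_i": 1500.0, "opp_elo_i": 1500.0, "game_location": "H"},
    {"game_id": "G1", "_iscopy": 1, "team_id": "BBB",
     "elo_i": 1500.0, "opp_elo_i": 1500.0, "game_location": "A"},
]

# Keep one row per game so each game is counted once.
unique_games = [g for g in games if g["_iscopy"] == 0]

home = elo_forecast(1500.0, 1500.0, "H")
away = elo_forecast(1500.0, 1500.0, "A")
print(len(unique_games), round(home, 3), round(away, 3))
```

Keep both rows per game when you want a per-team view (for example, one team's Elo rating over time), and filter on _iscopy when you want per-game counts.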

Sources

NFL Players Data

This dataset contains information about current NFL players.

Resources

Data Dictionary

  • nflId: Player identification number, unique across players (numeric)
  • height: Player height (numeric)
  • weight: Player weight (numeric)
  • birthDate: Date of birth (M/D/YYYY)
  • collegeName: Player college (text)
  • position: Player position (text)
  • displayName: Player name (text)
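
The birthDate column is text in M/D/YYYY form, so it needs to be parsed before you can chart player ages. A minimal sketch, using made-up players and an arbitrarily chosen reference date (both are assumptions, not values from the dataset):

```python
from datetime import date, datetime

def parse_birth_date(text):
    """Parse an M/D/YYYY string such as '3/7/1995' into a date."""
    return datetime.strptime(text, "%m/%d/%Y").date()

def age_on(birth, today):
    """Whole years between birth and today."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)

# Made-up players for illustration only (NOT rows from the real data).
players = [
    {"displayName": "Player A", "position": "QB", "birthDate": "3/7/1995"},
    {"displayName": "Player B", "position": "WR", "birthDate": "12/25/1990"},
    {"displayName": "Player C", "position": "QB", "birthDate": "1/1/1998"},
]

reference = date(2020, 1, 1)  # arbitrary reference date for the example
ages = {
    p["displayName"]: age_on(parse_birth_date(p["birthDate"]), reference)
    for p in players
}
print(ages)
```

With ages in hand, a histogram of age or a comparison of average age by position becomes straightforward.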

Sources

Titanic Passenger Data

This dataset contains information about passengers on the Titanic.

Resources

Data Dictionary

  • Survived: Whether the passenger survived or not
  • Pclass: Passenger ticket class (first, second, or third)
  • Name: Passenger's name
  • Sex: Passenger's sex
  • Age: Passenger's age
  • Siblings/Spouses Aboard: Number of siblings/spouses also aboard the Titanic
  • Parents/Children Aboard: Number of parents/children also aboard the Titanic
  • Fare: Amount paid for ticket (in pounds)
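
A common first chart for this data is the survival rate by ticket class. The sketch below computes those rates from a handful of made-up (Pclass, Survived) pairs. It assumes Pclass is stored as 1, 2, or 3 and Survived as 1 or 0, which may differ from how your copy of the data is encoded.

```python
from collections import Counter

# Made-up (Pclass, Survived) pairs for illustration only.
# Assumes Pclass is 1/2/3 and Survived is 1 (yes) or 0 (no).
passengers = [
    (1, 1), (1, 1), (1, 0),
    (3, 1), (3, 0), (3, 0), (3, 0),
]

def survival_rate_by_class(rows):
    """Return {Pclass: fraction of passengers who survived}."""
    total = Counter()
    survived = Counter()
    for pclass, did_survive in rows:
        total[pclass] += 1
        survived[pclass] += did_survive
    return {pclass: survived[pclass] / total[pclass] for pclass in sorted(total)}

rates = survival_rate_by_class(passengers)
for pclass, rate in rates.items():
    print(f"class {pclass}: {rate:.0%}")
```

The same grouping works for Sex or for age bands once you decide how to bucket the Age column.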

Sources
