Overview

This library holds data about Broadway shows, grouped over weeklong periods. Only shows that reported capacity were included, so the dataset stretches back to the 1990s. The dataset is made available by the Broadway League (the national trade association for the Broadway industry), and you can view the data online at http://www.broadwayleague.com/.

Explore Structure




The dataset is a list of dictionaries.

Index  Type  Example Value
0      dict  { }
...    ...   ...

Each dictionary in the list has the following keys:

Key           Type  Example Value  Comment
"Date"        dict  { }
"Statistics"  dict  { }
"Show"        dict  { }

Keys of the "Date" dictionary:

Key      Type  Example Value  Comment
"Full"   str   "08/26/1990"   The full date representation that this performance's week ended on in "Month/Day/Year" format.
"Day"    int   26             The day of the month that this performance's week ended on.
"Month"  int   8              The numeric month that this performance's week ended in (1 = January, 2 = February, etc.).
"Year"   int   1990           The year that this week of performances occurred in.

Keys of the "Statistics" dictionary:

Key                Type  Example Value  Comment
"Gross"            int   134456         The "Gross Gross" of this performance, or how much it made in total across the entire week. Measured in dollars.
"Performances"     int   8              The number of performances that occurred this week.
"Attendance"       int   5500           The total number of people who attended performances over the week.
"Capacity"         int   88             The percentage of the theatre that was filled during that week.
"Gross Potential"  int   0              The Gross Potential is the maximum amount an engagement can possibly earn based on calculations involving ticket prices, seating capacity, and the number of performances. This number is expressed here as a percentage of what could have been achieved (Gross Gross / Gross Potential). In case the GP could not be calculated, it was replaced with 0%.

Keys of the "Show" dictionary:

Key        Type  Example Value  Comment
"Type"     str   "Play"         Whether it is a "Musical", "Play", or "Special".
"Name"     str   "Tru"          The name of the production.
"Theatre"  str   "Booth"        The name of the theatre.
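
To make the nesting concrete, here is a minimal sketch that reads a few fields from one record. The record is built by hand from the example values listed in this section, purely for illustration; it is not guaranteed to match an actual entry in the dataset. The helper name describe_week is hypothetical and is not part of the library.

```python
# A hand-built record that reuses the example values from this section.
# It only illustrates the nesting; real records come from the library.
sample_record = {
    "Date": {"Full": "08/26/1990", "Day": 26, "Month": 8, "Year": 1990},
    "Statistics": {
        "Gross": 134456,
        "Performances": 8,
        "Attendance": 5500,
        "Capacity": 88,
        "Gross Potential": 0,
    },
    "Show": {"Type": "Play", "Name": "Tru", "Theatre": "Booth"},
}


def describe_week(record):
    """Hypothetical helper: summarize one weekly record as a string."""
    show = record["Show"]
    stats = record["Statistics"]
    return "{} ({}) at {}, week ending {}: ${} over {} performances".format(
        show["Name"],
        show["Type"],
        show["Theatre"],
        record["Date"]["Full"],
        stats["Gross"],
        stats["Performances"],
    )


print(describe_week(sample_record))
```

Each top-level key ("Date", "Statistics", "Show") leads to its own dictionary, so reading a value always takes two lookups, for example record["Show"]["Name"].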

Downloads

Download all of the following files.

Usage

This library has 2 functions you can use.
import broadway
list_of_production = broadway.get_shows()
list_of_production = broadway.get_show_by_theatre("friedman")
Additionally, both functions can return a smaller sample of the full dataset if you pass the extra argument test=True. Working with the sample may be much faster. When you are sure your code is correct, you can remove the argument to use the full dataset.
import broadway
# With test=True these return only a sample, so they should be faster.
list_of_production = broadway.get_shows(test=True)
list_of_production = broadway.get_show_by_theatre("friedman", test=True)
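
One way to take advantage of the sample is to write your analysis as a function that accepts the list of records, so the same code can be tried on the sample first and on the full dataset afterwards. The sketch below assumes the records have the shape described under Explore Structure. The helper name average_attendance_per_performance and the stand-in data are hypothetical and exist only so the example runs without the library installed.

```python
def average_attendance_per_performance(shows):
    """Hypothetical helper: mean attendance per performance across records."""
    total_attendance = 0
    total_performances = 0
    for record in shows:
        stats = record["Statistics"]
        total_attendance += stats["Attendance"]
        total_performances += stats["Performances"]
    # Avoid dividing by zero when the list is empty or reports no performances.
    if total_performances == 0:
        return 0.0
    return total_attendance / total_performances


# Stand-in data in the documented shape (made-up values, not real records).
stand_in = [
    {"Statistics": {"Attendance": 5500, "Performances": 8}},
    {"Statistics": {"Attendance": 4100, "Performances": 8}},
]
print(average_attendance_per_performance(stand_in))

# With the library installed, the same function could be run on the sample
# first and on the full dataset once the code is known to work:
#   import broadway
#   average_attendance_per_performance(broadway.get_shows(test=True))
#   average_attendance_per_performance(broadway.get_shows())
```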

Documentation

 broadway.get_shows(test=False)

Returns a list of dictionaries with information about all the shows. Pass test=True to get a smaller sample instead of the full dataset.

 broadway.get_show_by_theatre(theatre, test=False)

Returns a list of dictionaries with information about all the shows at a given theatre. Pass test=True to get a smaller sample instead of the full dataset.
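
As a rough sketch of what can be done with the returned list, the example below totals the weekly gross per year. It assumes each record has the "Date" and "Statistics" keys described under Explore Structure. The helper name gross_by_year and the stand-in records are hypothetical, with made-up values, so the example runs without the library installed.

```python
from collections import defaultdict


def gross_by_year(shows):
    """Hypothetical helper: sum the weekly gross (in dollars) for each year."""
    totals = defaultdict(int)
    for record in shows:
        year = record["Date"]["Year"]
        totals[year] += record["Statistics"]["Gross"]
    return dict(totals)


# Stand-in records in the documented shape (made-up values, not real data).
weeks = [
    {"Date": {"Year": 1990}, "Statistics": {"Gross": 134456}},
    {"Date": {"Year": 1990}, "Statistics": {"Gross": 100000}},
    {"Date": {"Year": 1991}, "Statistics": {"Gross": 50000}},
]
print(gross_by_year(weeks))

# With the library installed, the list could instead come from, for example:
#   import broadway
#   gross_by_year(broadway.get_show_by_theatre("friedman", test=True))
```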