Run all the regressions for the questions below in Python, in a Jupyter Notebook.
Use only observations from shows in mainland China.
Question 1. The variable av_tweets denotes the average number of tweets associated with an episode of each show (outside of the censored time period). This variable is therefore show-specific and does not vary over time. We can use it to capture the general level of social media interest in each show.
Generate a set of three dummy variables based on the av_tweets variable where:
the first dummy is equal to one for shows with fewer than 5 tweets per episode,
the second dummy is equal to one for shows with at least 5 but fewer than 100 tweets per episode, and
the third dummy is equal to one for shows with at least 100 tweets per episode.
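The three dummies can be built with plain boolean comparisons on av_tweets. The sketch below is only an illustration on a toy DataFrame. The column names low_tweets, mid_tweets and high_tweets are made up for this example, and the real data would be loaded from the assignment file instead.

```python
import pandas as pd

# Toy stand-in for the real data; only av_tweets comes from the question.
df = pd.DataFrame({"av_tweets": [0.0, 4.9, 5.0, 42.0, 99.9, 100.0, 350.0]})

# Cut-offs follow the question: < 5, at least 5 but < 100, >= 100.
# The dummy names are hypothetical.
df["low_tweets"] = (df["av_tweets"] < 5).astype(int)
df["mid_tweets"] = ((df["av_tweets"] >= 5) & (df["av_tweets"] < 100)).astype(int)
df["high_tweets"] = (df["av_tweets"] >= 100).astype(int)

# Sanity check: every show falls into exactly one of the three groups.
assert (df[["low_tweets", "mid_tweets", "high_tweets"]].sum(axis=1) == 1).all()
```

Rows with a missing av_tweets value would get zero in all three dummies, so the sanity check also flags missing values.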
Question 1a) i) In Python, run three separate regressions for:
shows with fewer than 5 tweets per episode,
shows with at least 5 but fewer than 100 tweets per episode, and
shows with at least 100 tweets per episode.
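One way to run the three regressions is to subset the data by activity level and fit the same model on each subset with statsmodels. The question does not name the outcome or the censorship-period indicator, so rating and censor below are placeholder names, and the data are simulated purely so that the sketch runs. With the real data, the formula should match the specification used in the earlier parts of the assignment, after restricting the sample to mainland China shows.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data, for illustration only. 'rating' and 'censor' are
# placeholder names; only av_tweets is named in the question.
rng = np.random.default_rng(0)
n = 900
toy = pd.DataFrame({
    "av_tweets": rng.choice([2.0, 50.0, 300.0], size=n),
    "censor": rng.integers(0, 2, size=n),  # 1 = censored period
})
# Made-up censorship effects: 0 (low), -0.5 (mid), -1.0 (high).
effect = toy["av_tweets"].map({2.0: 0.0, 50.0: -0.5, 300.0: -1.0})
toy["rating"] = 1.0 + effect * toy["censor"] + rng.normal(0, 0.1, size=n)

# One boolean mask per activity level, using the cut-offs from the question.
groups = {
    "low": toy["av_tweets"] < 5,
    "mid": (toy["av_tweets"] >= 5) & (toy["av_tweets"] < 100),
    "high": toy["av_tweets"] >= 100,
}

# Fit the same regression separately on each subsample.
results = {
    name: smf.ols("rating ~ censor", data=toy[mask]).fit()
    for name, mask in groups.items()
}

for name, res in results.items():
    print(name, round(res.params["censor"], 3), round(res.bse["censor"], 3))
```

Comparing the coefficient on censor and its standard error across the three fits is what part ii) asks about. Calling res.summary() in a notebook cell shows the full regression table for each fit.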
Question 1a) ii) What do you find in terms of impact of the censorship event across the three regressions?
Question 1b) i) In Python, run a difference-in-differences regression that allows the censorship event to have a different effect for the three sets of shows with the three different activity levels defined above.
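A pooled regression with interaction terms is one common way to let the effect differ by group. The sketch below interacts a censorship-period indicator with the mid and high activity dummies and leaves the low-activity group as the baseline. As before, rating, censor and the dummy names are placeholders, the data are simulated, and any controls or fixed effects from the assignment's own specification are omitted.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data, for illustration only (placeholder column names).
rng = np.random.default_rng(0)
n = 900
toy = pd.DataFrame({
    "av_tweets": rng.choice([2.0, 50.0, 300.0], size=n),
    "censor": rng.integers(0, 2, size=n),  # 1 = censored period
})
effect = toy["av_tweets"].map({2.0: 0.0, 50.0: -0.5, 300.0: -1.0})
toy["rating"] = 1.0 + effect * toy["censor"] + rng.normal(0, 0.1, size=n)

# Activity dummies; the low group is the omitted baseline category.
toy["mid_tweets"] = ((toy["av_tweets"] >= 5) & (toy["av_tweets"] < 100)).astype(int)
toy["high_tweets"] = (toy["av_tweets"] >= 100).astype(int)

# 'a * (b + c)' expands to a + b + c + a:b + a:c in the formula syntax.
did = smf.ols("rating ~ censor * (mid_tweets + high_tweets)", data=toy).fit()

print(did.params.round(3))
```

In this specification the coefficient on censor is the censorship effect for the baseline (low-activity) group, while censor:mid_tweets and censor:high_tweets measure how much the effect differs for the other two groups relative to that baseline.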
Question 1b) ii) Interpret the relevant coefficients.
Question 1c) i) Relate your findings across shows with different activity levels to the geographic difference-in-differences approach.
Question 1c) ii) Which regression is more informative regarding the impact of the censorship on ratings?