University of Chicago SUPERgroup

Usable Security and Privacy
Problem Set 5

Due on Canvas at 1:00pm on Tuesday, May 15th (originally 1:00pm on Monday, May 14th).

Problem 1 (40 points)

As platforms like Samsung's SmartThings have brought an app-ified Internet of Things (IoT) to consumers, concerns have been raised about whether consumers are adequately notified of the privacy risks of having Internet-connected devices in their homes. We therefore want you to design a privacy notice for IoT apps. You should turn in:

  1. One paragraph describing what medium (a smartphone screen, a spoken notification from a device like the Amazon Echo, a paper notice, etc.) you have chosen for delivering this privacy notice, and why.
  2. One paragraph describing what details you believe to be most important for an IoT app privacy notice to communicate, and why.
  3. One paragraph describing the decisions you made in designing your privacy notice.
  4. Sketches of your notice that you will use for a paper prototype. For more information on creating a paper prototype, please read this article. Note that you should include examples of all major screens or displays that a user would see. You can create the sketches on paper (but convert them to PDFs), in PowerPoint, in rapid-prototyping software, etc.

Problem 2 (60 points)

(This problem involves programming. If, and only if, you do not have a programming background, you may complete our alternate assignment for Problem 2.)

Twitter provides an API for collecting data posted on Twitter. The API allows you to get a real-time, random sample of all tweets being posted on Twitter that contain a set of keywords.

Use the Twitter API to collect, in real time for 8 hours, the tweets posted about information security and information privacy. Note that you need to create a Twitter developer account to collect the data. You will have to choose your keywords carefully so that you obtain sufficiently relevant data from the API. Write code to filter out non-English tweets from your collection.
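
As a rough illustration of the language-filtering step, the sketch below assumes the collected tweets were saved one JSON object per line and that each tweet carries Twitter's machine-detected `lang` field. The helper names (`is_english`, `load_english_tweets`) and the sample data are hypothetical, and relying on `lang` is only one possible approach; a separate language-detection library would be another.

```python
import json

def is_english(tweet: dict) -> bool:
    """Return True if the tweet's machine-detected language tag is English."""
    # Stream messages that are not tweets have no "lang" key and are rejected too.
    return tweet.get("lang") == "en"

def load_english_tweets(lines):
    """Parse one JSON object per line and keep only the English tweets."""
    english = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines (e.g. keep-alives in a saved stream)
        tweet = json.loads(line)
        if is_english(tweet):
            english.append(tweet)
    return english

# Hypothetical sample data standing in for a saved collection file.
sample_lines = [
    '{"id_str": "1", "lang": "en", "text": "New data breach reported"}',
    '{"id_str": "2", "lang": "es", "text": "Nueva brecha de datos"}',
    '',
    '{"id_str": "3", "lang": "und", "text": "#privacy"}',
]
english_tweets = load_english_tweets(sample_lines)
print([t["id_str"] for t in english_tweets])
```

Because the `lang` tag is assigned automatically, it can be wrong for very short or mixed-language tweets, so spot-checking a few filtered tweets by hand is worthwhile.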

Create a word cloud from the filtered text of your tweets after removing all stop words, punctuation, and user mentions (tokens starting with "@").
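
One way to prepare the text for a word cloud is sketched below using only the Python standard library. The stop-word list is a tiny illustrative assumption (a fuller list would be needed in practice), and `clean_tokens` is a hypothetical helper name. Mentions are dropped before punctuation is stripped, since removing "@" first would turn a mention into an ordinary word.

```python
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a complete list).
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to",
              "in", "on", "for", "about", "your", "my"}

# Translation table that deletes ASCII punctuation only;
# curly quotes and other Unicode punctuation are left untouched.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def clean_tokens(text: str) -> list:
    """Lowercase, then drop @mentions, punctuation, and stop words."""
    tokens = []
    for token in text.lower().split():
        if token.startswith("@"):
            continue  # user mention: skip before stripping punctuation
        token = token.translate(PUNCT_TABLE)
        if token and token not in STOP_WORDS:
            tokens.append(token)
    return tokens

# Hypothetical tweet texts for illustration.
texts = [
    "@alice The new #privacy policy is alarming!",
    "Privacy and security: patch your router, @bob.",
]
freqs = Counter()
for text in texts:
    freqs.update(clean_tokens(text))

print(freqs.most_common(3))
```

The resulting frequency table can then be handed to whatever renderer you prefer; for example, the third-party `wordcloud` package accepts such a mapping through `WordCloud().generate_from_frequencies(freqs)`.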

Finally, we would like to know which information security and privacy concerns were most prominent among Twitter users during your data collection. To that end, randomly sample 50 English tweets from your collected set and manually divide them into at most 6 thematic categories representing information privacy and security issues. Turn in the following:

  1. All code you wrote (which can be in any programming language)
  2. The list of keywords you chose to use to obtain your tweets
  3. The time of tweet collection, the number of unique (English) tweets you collected, the number of unique users who posted those tweets, and a random sample of 10 (English) tweets from your set.
  4. The word cloud you generated
  5. A list of the top ten keywords from the word cloud and a paragraph about how these keywords are linked to information security and privacy
  6. A table describing the thematic categories you manually created, the number of tweets (out of 50) in each theme, and an example tweet for each
Items 2-6 should be submitted as a single PDF document.
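
The counting and sampling steps in Problem 2 could be sketched as follows. This assumes each collected tweet is a dict with an `id_str` field and a nested `user` object that also has an `id_str`; the helper names and sample data are hypothetical. Here "unique" is taken to mean unique by tweet ID, which is an assumption; you may instead choose to deduplicate by text (for example, to collapse retweets).

```python
import random

def unique_tweets(tweets):
    """Deduplicate by tweet ID, keeping the first copy of each tweet."""
    seen = {}
    for tweet in tweets:
        seen.setdefault(tweet["id_str"], tweet)
    return list(seen.values())

def unique_user_ids(tweets):
    """Return the set of IDs of the users who posted the given tweets."""
    return {tweet["user"]["id_str"] for tweet in tweets}

def sample_tweets(tweets, k, seed=None):
    """Draw up to k tweets uniformly at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return rng.sample(tweets, min(k, len(tweets)))

# Hypothetical collected data: tweet "1" appears twice.
collected = [
    {"id_str": "1", "user": {"id_str": "u1"}, "text": "Data breach at a retailer"},
    {"id_str": "2", "user": {"id_str": "u2"}, "text": "Enable two-factor auth"},
    {"id_str": "1", "user": {"id_str": "u1"}, "text": "Data breach at a retailer"},
    {"id_str": "3", "user": {"id_str": "u1"}, "text": "Phishing emails again"},
]
tweets = unique_tweets(collected)
users = unique_user_ids(tweets)
subset = sample_tweets(tweets, k=2, seed=0)

print(len(tweets), len(users), len(subset))
```

For the assignment itself, `k` would be 50 for the thematic coding and 10 for the sample you turn in.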

(CMSC 33210 only!) Problem 3 (0 points; -60 points if not completed)

Write 3-7 sentence summaries and short "highlights" for the Englehardt and Narayanan reading assigned for May 2nd, the Wang et al. reading assigned for May 7th, the Tramer et al. reading assigned for May 9th, and the Miramirkhani et al. reading assigned for May 14th.