Wednesday, July 17, 2013

Fairy Tales, As Told By Statisticians (#5)


I’ve been told that statistics is about conveying a maximum of information as quickly and simply as possible.  CEOs don’t have time for your flowery language!  So I wondered, how would a data scientist squish classic stories into snappy, digestible visuals?  In charts, of course!  Here are a couple for your amusement:

Tortoise and the Hare


Goldilocks and the Three Bears

In a real presentation, I would probably make the story as obvious as possible.  But in this post, it's more fun to piece a puzzle together.

The Ugly Duckling
It ain't easy being at the end of the bell curve.

In this one I tried to get fancy, and probably wrecked my one rule of "quick and simple" stories.  Oh well, it’s funnier this way.


What's the point?  Well, I made this as a response to my boss telling me to play around with displaying data in different ways, and thought I'd make it fun for myself.  

But really, when we visualize data, we shouldn't have to sacrifice clarity to make graphs enjoyable to absorb.  Why not place Abe Lincoln next to our graphs if it's part of the graph's main point?  Why not color code our bars and distributions to hint at what they represent?  No legend is needed when your bar denoting bricks is made of bricks.  This could increase the visual's "at-a-glance" or reminder value, as long as you don't get carried away.  I digress, but the point is: how do you present information to other people in a way that is both creative and clear?

-Jordan 

Wednesday, July 10, 2013

Apps: The Time-Saving Paradox (#4)




I have Evernote, SMemo, Any.do, Stickynotes, Calendar, and Smart Voice Recorder on the first two screens of my phone.  And I still didn’t get much work done when I came home last night.  Actually, let me clarify.  I sent off some emails, put the rest in archived folders, and deleted extraneous downloads and setup files on my computer.  On the way home, I put deep thought into which apps I use most to move to the front page (Flipboard or Snapchat? Definitely Flipboard), and, within that page, moved Facebook and Music Player (the top picks, of course) to the place where my thumb most naturally hovers.  So, I did no real work after work yesterday.

That day was a bit extreme, but typical enough that I recognized a trap I fall into, one that I think much of the young professional world is drifting toward.  We are obsessed with saving time to the point that we don’t care if we’re taking seconds to shave off milliseconds.  We feel like we have no spare time…so we use it poorly, as a reflex.  This extends to our myriad task management strategies (I’ve seen friends write on themselves, carry around a whiteboard in their bags, put sticky notes everywhere, and worse).  Seriously, the hundreds of hours spent designing and integrating weather widgets are probably greater than the time people save vs. touching the app (…or sticking their heads out the window).  I believe I've reached the limit of time-saving apps.

What I mean is, apps are great.  So is the internet.  So are device syncing, streaming from the cloud, and SD cards that can store a million articles for you to read later.  But none of these save as much time as goal-setting and focus.  If I had just closed every other window last night, I would have gotten through those stat software tutorials.  I don’t need the Self Control App, I just need self-control.

I made a rough list to set priorities, to constrain my options as the tech world insists on giving me more every year.  I won’t tell you how to live your life, but I do better when I follow rules like these:

1. Do not let yourself call organizing work.  That’s a copout.
2. Unplug at some point every day, not because cell phones are giving you cancer or carpal tunnel, but because you’re human, and it’s alright that you can’t resist your devices.
3. Don’t allow more than ~5 tasks to accumulate on a note app each day (put the low priority ones somewhere else), and aim to complete just over half of them.  You don’t need the stress of clearing a self-memo, or an inbox, every single day.  In fact, just delete that one that will take a month right now.


Keep it reasonable, organize it once, and Nike.

Tuesday, July 2, 2013

Big Data and Big Government: The Big Difference (#3)

I’d like to talk about a common argument of those who play down the recent developments involving government surveillance.
                
[In case you lost track: here and here are a good start]

I want to be a researcher and analyst in the world of big data, who collects information from consumers to better predict what they like, buy, click on, etc.  I am also outraged that the federal government has been collecting digital information on a massive scale.  The retort among some is, “if you like Facebook collecting my data, why can’t the government?”
               
My first reaction is that this seems like a reasonable “gotcha”.  I must hate government and forgive the evils of big business, right?  Well, neither is true.  I don’t hate many of government’s services, but I do honestly fear its potential to abuse its unique power over American people and institutions.  I’ll resist (with great effort) making this post a rant about political ideology or philosophy, and instead just explain the big differences between a company collecting user data and a government collecting company data.
          
Let’s talk about two scenarios: worst-case and likely-case.  With a corporation, the likely-case is that companies give you ads, recommendations, and new products so well targeted that they border on creepy.  Hardware and software makers will collect data on everything you do on their device or app or site, from location patterns to app usage to web browsing and search history.

But what about the worst-case scenario?  Well, it’s pretty similar to what I just described. Big data firms in 2013 are doing basically everything they can, within the confines of the law, to quantify your behavior for analysis.  In the worst-case, companies try to sell your data among themselves, even to foreign governments or criminals, or try to defraud you themselves, in which case you sue or tell a police officer or get the government involved in some way.  Data privacy laws are new and evolving (and the topic of a future note!), but I fully expect an equilibrium to be reached within the decade where companies cannot sell or use your data forever without continued and clear permission.  The point is, businesses are rightly constrained by laws, competition (if you’re being too shady with your snooping I’ll use the other service) and their need to work under agreed contracts.

What of the state?  I hardly need to point out that every totalitarian state, from the U.S.S.R. to North Korea to those in the Middle East, is built on propaganda and suppression of dissent.  The worst-case scenario with government is a gradual transition towards silencing or otherwise harassing anyone who voices their disagreement.  Say you’re an activist for the minority party, a businessman in an unpopular industry, or a journalist merely reporting all sides.  The government might pore through its treasure trove of information, which can be "collected inadvertently" and "stored indefinitely", to find something to either imprison you or bury you under a stream of lawsuits.  It could bankrupt you or your business with harsher regulations for having the "wrong" religious or political views, and even blackmail you (a fascinating example of this is told by user “161719” here, highlighted).

The state has no competition, and you have no ability to opt out apart from leaving the country.  It oversees itself, and answers to no one except frankly weak supranational organizations like the UNSC and ICC.  We are not doomed to lose our freedom under the big data state, and the government is full of generally sensible, sane and well-intentioned people.  But in the likely-case, the data from Facebook and Apple and Google, combined with census data and tax records and traffic light cameras and more, will be subject to mild but continuous misuse, whether through malevolence or mere foolishness.  And we’ll never know to object, as the entire apparatus is secret.  We’re in an age where both congressional approval (14%) and the share of Americans who trust government to “do the right thing” most of the time (24%) are at record lows, and we just received an example of the IRS using data analysis to harass conservative organizations.  Have government officials earned the right to yottabytes of personal data to mine as they see fit, a billion times more data than Facebook?

To conclude, big data collection without laws or other restraints is bad, whether it’s in industry or by the state.  But government collection has far surpassed the level of potential danger presented by the big data companies I want to work for, and I don’t believe we should allow pundits to discuss them on the same plane.

-Jordan