
Why Amazon's Fire Phone is incredibly smart…and what it means for the future of smartphones

The announcement of the Amazon Fire Phone is one of the most interesting pieces of technology news I've come across in recent times. While the jury is out on whether it will be a commercial success (UPDATE: recent estimates suggest sales could be as low as 35,000 units), the features that the phone comes with got me thinking about the technical advancements that have made it possible.

The caveat here is that much of what follows is speculation – but I do have a background in research on speech recognition and computer vision-related user experience. I'm going to dive into why the Fire Phone's features are an exciting advance in computing, what they mean for the future of phones in terms of end-user experience, and a killer feature I think many other pundits are missing.

Fire Phone’s 3D User Interface

Purkinje image: glints off an eye are used to correlate a known position to an unknown one

I did my final year research project on using eye tracking on mobile user interfaces as a method of user research. The problem with many current methods of eye tracking is that they require specialised hardware – typically the approach is to use a camera that can "see" in infrared, illuminate the user's eye using infrared, and use the glint from the eye to track the position of the eyeball relative to the infrared light sources.

This works fabulously when the system is desktop-based. Chances are, the user is going to be within a certain range of distance from the screen, and facing it at a right angle. Since the infrared light sources are typically attached to corners of the screen – or at an otherwise-known fixed distance – it's relatively trivial to figure out the angles at which a glint is being picked up. Indeed, if you dive into research into this particular challenge in computer vision, you'll mostly find variations on how best to use cameras in conjunction with infrared.
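To make the glint idea concrete, here is a minimal sketch of the detection step, assuming an OpenCV-readable infrared frame; the threshold values and function names are mine for illustration, not taken from any shipping tracker.

```python
import cv2

def find_glints(ir_frame, threshold=220, max_glints=4):
    """Locate bright specular reflections (glints) in a grayscale IR frame."""
    # Glints are near-saturated, so a hard threshold isolates them from the iris/sclera
    _, bright = cv2.threshold(ir_frame, threshold, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(bright, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Keep the largest few blobs (smaller ones are usually sensor noise)...
    contours = sorted(contours, key=cv2.contourArea, reverse=True)[:max_glints]
    centroids = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] > 0:
            # ...and return their centroids, which can later be related to the
            # known positions of the IR light sources to estimate gaze
            centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centroids

# Usage: print(find_glints(cv2.imread("eye_ir.png", cv2.IMREAD_GRAYSCALE)))
```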

Visual angles

The drawback to this approach is that the complexity involved increases vastly on mobile platforms. To figure out the angle at which a glint is being received, it's necessary to work out the orientation of the phone from its gyroscope (how fast its orientation is changing) and accelerometer (which way gravity, and hence "down", points). In addition to this, the user themselves might be facing the phone at an angle rather than straight on, which adds another level of complexity in estimating pose. (The reason this is needed is to estimate visual angles.)
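As a toy illustration of that sensor-fusion step, here is a minimal complementary-filter sketch for estimating device pitch from gyroscope and accelerometer readings; the sample interval and blend factor are arbitrary illustrative values, and this is certainly not how any particular phone implements it.

```python
import math

def update_pitch(prev_pitch_deg, gyro_pitch_rate_dps, accel_x, accel_y, accel_z,
                 dt=0.02, alpha=0.98):
    """Blend gyroscope and accelerometer readings into a device pitch estimate.

    The gyroscope integrates quickly but drifts over time; the accelerometer
    gives an absolute (gravity-referenced) angle but is noisy. A complementary
    filter trusts the gyro in the short term and the accelerometer in the long term.
    """
    # Integrate the gyro's angular rate over the sample period
    gyro_pitch = prev_pitch_deg + gyro_pitch_rate_dps * dt
    # Absolute pitch from the gravity vector measured by the accelerometer
    accel_pitch = math.degrees(math.atan2(accel_x, math.sqrt(accel_y ** 2 + accel_z ** 2)))
    # Weighted blend of the two estimates
    return alpha * gyro_pitch + (1 - alpha) * accel_pitch
```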

My research project's approach used techniques similar to a desktop-based eye tracking package called Opengazer, coupled with pose estimation on mobiles, to track eye gaze. In fact, before the Amazon Fire Phone there was another phone that touted "eye tracking" (according to the NYT): the Samsung Galaxy S IV.

I don't actually have a Samsung Galaxy to play with – nor did the patent mentioned in the New York Times article linked above show any valid results – so I'm basing my guesses on demo videos. Using current computer vision software, given the proper lighting conditions, it's easy to figure out whether the "pose" of a user's head has changed: instead of a big, clean circular eyeball, you detect an oblong one, which suggests the user has tilted their head up or down. (The "tilt device" option for Samsung's Eye Scroll, on the other hand, isn't eye tracking at all, as it just uses the accelerometer / gyroscope to figure out that the device is being tilted.)

What I don't think the Samsung Galaxy S IV can do with any accuracy is pinpoint where on the screen a user is looking, beyond detecting that the face has changed from a straight-on view to something else.
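Here is a rough sketch of the kind of shape cue I mean, assuming you already have a cropped grayscale eye region (the threshold value is an arbitrary placeholder); note that it tells you "the eyeball looks oblong", not "the user is looking at pixel (x, y)".

```python
import cv2

def eye_roundness(eye_region_gray, dark_threshold=60):
    """Return a rough roundness cue for the visible eyeball (1.0 = circular).

    Fits an ellipse to the largest dark blob (pupil/iris) in a cropped eye
    image; a ratio well below 1.0 suggests the head is tilted relative to a
    straight-on view, but it says nothing about the exact gaze point.
    """
    _, dark = cv2.threshold(eye_region_gray, dark_threshold, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(dark, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    if len(largest) < 5:            # cv2.fitEllipse needs at least 5 points
        return None
    _, axes, _ = cv2.fitEllipse(largest)
    minor, major = sorted(axes)
    return minor / major
```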

What makes the Fire Phone’s “3D capabilities” impressive?

Watch the demo video above of Jeff Bezos showing off the Fire Phone's 3D capabilities. As you can see, it goes beyond the current state of the art in the Galaxy S IV – in the sense that to accurately follow and tilt the perspective based on a user's gaze, the eye tracking has to be incredibly accurate. Specifically, instead of merely basing motion on how the device is tilted or how the user's head moves away from a straight-on view, it needs to combine device tilt / pose, head tilt / pose, as well as computer vision pattern recognition to figure out the visual angles the user is looking at an object from.

Here’s where Amazon has another trick up its sleeve. Remember how I mentioned that glints off infrared light sources can be used to track eye position? Turns out that the Fire Phone uses precisely that setup – it has four front cameras, each with its own individual infrared light source to accurately estimate pose along all three axes. (And in terms of previous research, most desktop-based eye tracking systems that are considered accurate also use at least three fixed infrared light sources.)

So to recap, here's my best guess on how Amazon is doing its 3D shebang:

  • Four individual cameras, each with its own infrared light source. Four individual image streams that need to be combined to form a 3D perspective…
  • …combined with the device's orientation in the real world, derived from its gyroscope…
  • …and how quickly it's being moved and tilted, based on its accelerometer

Just dealing with one image stream alone, on a mobile device, is a computationally complex problem in its own right. As hardware becomes cheaper, more smartphones include higher resolution front cameras (and better image sensors, so it isn't just the resolution but the quality that improves)…which in turn gives better quality images to work on…but it also creates another problem in that there's a larger image to process onboard the device. This is a challenge because, based on psychological research into how people tend to perceive visual objects, there's a narrow window – in the range of hundreds of milliseconds – within which a user's gaze rests on a particular area.

On a desktop-class processor, doing all of this is a challenge. (This article is a comparison of JavaScript on ARM processors vis-a-vis desktop, but the lessons are equally valid for other, more computationally complex tasks such as computer vision.) What the Amazon Fire Phone is doing is combining images from four different cameras, as well as its sensor data, to build a model of the world and change perspective accordingly…in real time. As someone who's studied computer vision, I find this an incredibly exciting advance in the field!

My best guess on how they’ve cracked this would be to use binary segmentation instead of feature extraction. That was the approach I attempted when working on my project, but I could be wrong.
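For the curious, here is a toy illustration of what I mean by binary segmentation – threshold the frame into foreground and background and work with blob geometry, rather than computing and matching feature descriptors. It is entirely my own sketch, not Amazon's pipeline.

```python
import cv2

def segment_largest_blob(gray_frame, thresh=None):
    """Binary-segment a grayscale frame and return the largest foreground blob.

    Compared with descriptor-based feature extraction and matching (SIFT/ORB),
    a single global threshold plus connected-component analysis is cheap
    enough to run on every frame on a phone-class CPU.
    """
    if thresh is None:
        # Otsu's method picks a threshold automatically from the histogram
        _, binary = cv2.threshold(gray_frame, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    else:
        _, binary = cv2.threshold(gray_frame, thresh, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    blob = max(contours, key=cv2.contourArea)
    # Bounding box of the dominant blob, e.g. a face or eye region to track
    return cv2.boundingRect(blob)
```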

Is a 3D user interface a good idea though?

Right now, based purely on the demo, it seems that the 3D interface is a gimmick that may fly well when potential customers are in a store testing out the product. Amazon's marketing certainly seems to be banking on a "Wow, that's really cool" reaction. Personally, I felt the visual aesthetics were less 21st century and more like noughties 3D screensavers on Windows desktops.

Windows 2000 3D pipes screensaver
“Look at all that faux depth!” – An Amazon executive, probably

Every time a new user interface paradigm like the Fire Phone's Dynamic Perspective or the Leap Motion controller comes along, I'm reminded of this quote from Douglas Adams' The Hitchhiker's Guide To The Galaxy (emphasis mine):

For years radios had been operated by means of pressing buttons and turning  dials; then as the technology became more sophisticated the controls were made touch-sensitive – you merely had to brush the panels with your fingers; now all you had to do was wave your hand in the general direction of the components and hope. It saved a lot of muscular expenditure of course, but meant that you had to sit infuriatingly still if you wanted to keep listening to the same programme.

My fear is that in an ever-increasing arms race to wow customers with new user interfaces, companies will go too far in trying to incorporate gimmicks such as Amazon's Dynamic Perspective or Samsung's Eye Scroll. Do I really want my homescreen or what I'm reading to shift away if I tilt my phone one way or the other, like the Fire Phone does? Do I really want the page to scroll based on what angle I'm looking at the device from, like Samsung does? Another companion feature on the Galaxy S IV, called Eye Pause, pauses video playback if the user looks away. Anecdotally, I can say that I often "second screen" by browsing on a different device while watching a TV show or a film…and I wouldn't want playback to pause merely because I flick my attention between devices.

Another example of unintended consequences of new user interfaces is the Xbox One advert featuring Breaking Bad‘s Aaron Paul. Since the Xbox One comes with speech recognition technology, playing the advert on TV inadvertently turns viewers’ Xboxes on. Whoops.

What’s missing in all of the above examples is context – much like what was illustrated by Douglas Adams’ quote. In the absence of physical, explicit controls, interfaces that rely on human interaction can’t distinguish whether a user meant to change the state of a system or not. (Marco Arment talks through this “common sense” approach to tilt scrolling used in Instapaper.)

One of the things that I learnt during my research project was that there's a serious lack of usability studies for mobile devices in real-world environments. User research on how effective new user interfaces are – not just in general terms, but also at the app level – needs to be dug into more deeply to figure out what's going on in the field.

In the short term, I don't think sub-par interfaces such as the examples I mentioned above will become mainstream, because the user experience is spotty and much less reliable. Again, this is pure conjecture because, as I pointed out, there's a lack of hard data on how users actually behave with such new technology. My worry is that if such technologies become mainstream (they won't right now; patents) without solving the context problem, we'll end up in a world where hand-gesture-sensitive radios are common purely because "it's newer technology, hence, it's better".

(On a related note: How Minority Report Trapped Us In A World of Bad Interfaces)

Fire Phone’s Firefly feature: search anything, buy anything

Photo: Ariel Zambelich/WIRED

Staying on the topic of computer vision, another of the Fire Phone's headline features is Firefly – which allows users to point their camera at an object and have it ready to buy. Much of the analysis around the Fire Phone that I've read focuses on the "whale strategy" of getting high-spending users to spend even more.

While I do agree with those articles, I wanted to bring in another dimension into play by talking about the technology that makes this possible. Thus far, there has been a proliferation of “showrooming” thanks to barcode scanner apps that allow people to look up prices for items online…and so the thinking goes, a feature like Firefly which reduces friction in the process will induce that kind of behaviour further and encourage people to shop online – good for Amazon, because they get lock-in. Amazon went so far as to launch a standalone barcode scanning device called the Amazon Dash.

My hunch – from my experience of computer vision – is that the end-user experience using a barcode scanner versus the Firefly feature will be fundamentally different in terms of reliability. (Time will tell whether it’s better or worse.) Typically for a barcode scanner app:

  • Users take a macro (close-up) shot of the barcode: What this ensures is that even in poor lighting conditions, there’s a high quality picture of the object to be scanned. Pictures taken at an angle can be skewed and transformed to a flat picture before processing, and this is computationally “easy”.
  • Input formats are standardised: Whether it’s a vertical line barcode or a QR code or any of the myriad formats, there’s a limited subset of patterns that need to be recognised. Higher probability that pattern recognition algorithms can find an accurate match.

Most importantly, thanks to standardisation in retail, if the barcode conforms to the Universal Product Code or International Article Number (the two major standards), any lookup can be accurately matched to a unique item ("stock keeping unit" in retail terminology). Once the pattern has been identified, a text string is generated that can be quickly looked up in a standard relational database. This makes lookups accurate (was the correct item identified?) and reliable (how often is the correct item identified?).
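As a rough sketch of that whole pipeline – decode a standardised symbology, then do a plain relational lookup – here is what it might look like with off-the-shelf libraries. The pyzbar / Pillow calls are real, but the products.db database and its schema are hypothetical stand-ins.

```python
import sqlite3
from PIL import Image
from pyzbar.pyzbar import decode   # pip install pyzbar pillow

def lookup_product(image_path, db_path="products.db"):
    """Decode a barcode photo and look the code up as a unique SKU."""
    # The decoder only has to match a handful of standardised symbologies
    # (EAN-13, UPC-A, QR, ...), which keeps recognition accurate and cheap
    results = decode(Image.open(image_path))
    if not results:
        return None
    code = results[0].data.decode("ascii")   # e.g. "5000159484695"

    # The decoded digits map to exactly one stock keeping unit,
    # so the final step is an ordinary relational lookup
    conn = sqlite3.connect(db_path)
    row = conn.execute("SELECT name, price FROM products WHERE ean = ?", (code,)).fetchone()
    conn.close()
    return code, row
```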

I haven't read any reviews yet on how accurate Amazon Firefly is, since the phone has just been released. However, we can get insights into how it might work since Amazon released a feature called Flow for their iOS app that does basically the same thing. Amazon Flow's own product page from its A9 search team doesn't give much insight, and I wasn't able to find any patents related to this that might have been filed. I did come across a computer vision blog, though, that covered similar ground.

How Amazon Flow might work

Now, object recognition on its own is a particularly challenging problem in computer vision, from my understanding of the research. Object recognition – of the kind Amazon Firefly would need – works great when the subset is limited to certain types of objects (e.g., identifying fighter planes against buildings) but things become murkier if the possible subset of inputs is anything that a user can take a picture of. So barring Amazon making a phenomenal advance in object recognition that could recognise anything, I knew they had to be using some clever method to short-circuit the problem.

The key takeaway from that image above is that in addition to object recognition, Amazon's Flow app quite probably uses text recognition as well. Text recognition is a simpler task because the subset of possibilities is limited to the English alphabet (in the simplest example; of course, it can be expanded to include other character sets). My conjecture is that Amazon is actually using text recognition rather than object recognition; it's extracting the text that it finds on the packaging of a product, rather than trying to figure out what an item is merely based on its shape, colour, et al. News articles on the Flow app seem to suggest this. From Gizmodo:

In my experience, it only works with things in packaging, and it works best with items that have bold, big typography.

Ars Technica tells a similar story, specifically for the Firefly feature:

When scanning objects, Firefly has the easiest time identifying boxes or other packaging with easily distinguishable logos or art. It can occasionally figure out “naked” box-less game cartridges, discs, or memory cards, but it’s usually not great at it.

If this is indeed true, then the probability is that the app is doing text recognition – and then searching for that term on the Amazon store. This leaves open the possibility that even though the Flow app / Firefly can figure out the name of an item, it won’t necessarily know the exact item stock type. Yet again, another news article seems to bear this out. From Wired:

And while being able to quickly find the price of gadgets is great, the grocery shopping experience can sometimes be hit-or-miss when it comes to portions. For example, the app finds the items scanned, but it puts the wrong size in the queue. In one instance, it offered a 128-ounce bottle of Tabasco sauce as the first pick when a five-ounce bottle was scanned.

From a computer vision perspective, this is not surprising, since items with the same name but different sizes might have the same shape…and based on the distance a picture is shot at, the best guess of a system that combines text + shape search may not find an exact match in a database. It is also a significantly more complex database query, as it needs to compare multiple feature sets that may not necessarily be easily stored in a relational database (think NoSQL databases).
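If my text-recognition conjecture is right, the core of the trick might look something like the sketch below: OCR the prominent packaging text and feed it to an ordinary product search. This is my own guess at the approach, not Amazon's code; pytesseract and Pillow are real libraries, while the filtering heuristics are placeholders.

```python
from PIL import Image
import pytesseract   # pip install pytesseract (also needs the Tesseract OCR binary)

def guess_search_terms(photo_path, min_word_len=3, max_terms=5):
    """Extract prominent packaging text from a photo and turn it into a search query.

    Instead of recognising an object's shape from scratch, read the big, bold
    words on the packaging and hand them to a normal product search engine.
    """
    text = pytesseract.image_to_string(Image.open(photo_path))
    words = [w.strip(".,:;!?") for w in text.split()]
    # Keep the longer alphabetic words -- usually the brand and product names
    terms = [w for w in words if len(w) >= min_word_len and w.isalpha()]
    return " ".join(terms[:max_terms])

# e.g. guess_search_terms("hot_sauce.jpg") might return "Tabasco Brand Pepper Sauce" --
# which, as the Wired quote above suggests, says nothing about the bottle's size.
```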

How does all of the above impact the usage of the Firefly feature?

The specifics of what Amazon Firefly / Flow can and cannot do will have a significant impact on usage. If it can identify objects and items based on shape, in the absence of packaging or a name, then Amazon Firefly will be a game changer. It's a hitherto unprecedented use case that allows people to buy items they don't even know the exact name for, and hence creates an opportunity for them to buy items (from Amazon, of course) which they otherwise may not have been able to find.

If, however, Amazon Firefly can merely read bold text on a bottle / packaging, then it will remain a gimmick. Depending on how good the text recognition is, users may find no significant benefit compared to typing out the name of a product which is clearly visible.

…but even if Firefly isn’t amazing yet, Amazon has a plan

One of the fundamental problems in trying to solve computer vision problems (such as object recognition) is that researchers need datasets to test out algorithms on. Good datasets are incredibly hard to find. Often, these are created by academic research groups and are thereby restricted by limited budgets in the range of pictures gathered in a database.

Even if, right now, Amazon Firefly can only do text recognition, releasing the Flow app and the Firefly feature into the wild allows them to capture a dataset of images of objects at an industrial scale. Presumably, if it identifies an object incorrectly and a user corrects it to the right object manually, this is incredibly valuable data which Amazon can then use to fine-tune their image recognition algorithms.

Google has a similar app called Google Goggles (Android only), which can identify a subset of images, such as famous pieces of art, landmarks, text, barcodes, etc. At the moment, Google isn't using this specifically in a shopping context – but you can bet they will be collecting data to make those use cases better.

Dark horse feature: instant tech support through Mayday

Amazon's Mayday button provides real-time video tech support

Among all of the reviews that I’ve read for Amazon Fire Phone in tech blogs / media, precisely one acknowledged the Mayday tech support feature as a selling point: Farhad Manjoo’s article in the New York Times. Perhaps tech beat writers who would rather figure out things themselves didn’t have much need for it, and hence skipped over it. Mayday provides real-time video chat 24 / 7 with an Amazon tech support representative, with the ability for them to control and draw on the user’s screen to demonstrate how to accomplish a task.

Terms such as “the next billion” often get thrown about for the legions of people about to upgrade to smartphones. This could be in developing countries, but also for people in developed countries looking to upgrade their phones. Instant tech support, without having to go to a physical store, would be a killer-app for many inexperienced smartphone users. (Hey, maybe people just want to serenade someone, propose marriage, or order pizza while getting tech support.)

I think that fundamentally, beyond all the gimmicky features like Dynamic Perspective, the Mayday feature is what is truly revolutionary – in terms of how Amazon has been able to scale it to provide a live human to talk to within 10 seconds (not just from a technical perspective, but also in terms of how to run a virtual contact centre). Make no mistake: while Amazon may have been the first to do it at scale, this is where the future of customer interaction lies. The technology behind Mayday could easily be offered as a white-label solution to enterprises – think "AWS for contact centres" – or in B2C to offer personalised recommendations or app-specific support.

(Also check out this analysis of how Amazon Mayday uses WebRTC infrastructure.)

What comes next from Amazon?

Amazon's famously opaque charts-without-numbers

There aren't sales figures yet for the Amazon Fire Phone. Don't hold your breath for them either: Amazon is famous for its opaque "growth" charts in all its product unveiling events. It may be able to push sales through the valuable real estate it has on the Amazon.com home page – or it might fall flat. Regardless of what happens, this opacity does afford Amazon the luxury of refining its strategy in private. (Even if it sells 100 units vs 10 units in the previous month, you can damn well be sure they'll have a growth chart screaming "900% increase!!!")

One thing that's worth bearing in mind is that Jeff Bezos has demonstrated a proclivity towards the long game, rather than short-term gains. Speaking to the New York Times, he says:

We have a long history of getting started and being patient. There are a lot of assets you have to bring to bear to be able to offer a phone like this. The huge content ecosystem is one of them. The reputation for customer support is one of them. We have a lot of those pieces in place.

You can see he touches on all the points I mentioned in this article: ecosystem, research, customer support. Amazon's tenacity is not to be discounted. Even if this version-one Fire Phone is a flop, they're sure to iterate on making the technology and feature offerings better in the next version.

Personally, I’m quite excited to see how baby steps are being taken in terms of using computer vision (Fire Phone, Word Lens!), speech recognition (Siri, Cortana), and personal assistants (Google Now). Let the games begin!


Exun 2013

This weekend, I was at Exun 2013, one of Delhi's biggest computer technology symposiums (along with Code Wars). Having been a participant at the event for many, many years, it felt nice to be back at Exun as a judge and meet so many bright kids into technology.

I signed on for Exun when DPS RK Puram's HoD, Mr Mukesh Kumar, got in touch with me a couple of weeks ago about conducting the junior quiz, senior quiz, and crossword. I expected nothing short of the best teams at this event, which is why I knew I needed to put in extra effort to ensure Exun's tradition of high event standards was maintained.

It's worth noting that *42* teams participated in the Exun Junior Quiz prelims

It was an amazing – and tiring – experience to conduct the three events, but I loved every minute of it. Raghav Khullar helped me build the question archive for all events, and Exun members / Mukesh sir helped me with organisational logistics at every stage. And with that, I present the archives for Exun 2013:

  • Junior Quiz: Prelims (PDF, ~370 KB); Finals (PDF, ~3.3 MB)
  • Crossword: Prelims (opens in a new window); Finals (opens in a new window) n.b. I'm aware of an error in one question in the finals, where the answer should have been "SILKROAD" and not "SILKROUTE"
  • Senior Quiz: Prelims (PDF, ~680 KB); Finals (ZIP, ~6.7 MB) n.b. I had to use PPTX for the finals presentation decks because they contain embedded media.

Hope the teams enjoyed the quizzes and the crossword. Feedback appreciated! :)



Lessons in smartphone videography: What I learnt from editing a video shot on a phone

I have been obsessed for a while now with the idea of making a short film shot entirely on a smartphone. The versatility that a phone would allow in sheer ease of organising filming schedules is what attracts me the most. There has been a significant amount of interest from amateur / professional filmmakers in the industry along similar lines.

What I did not want to do, however, is to plan a shoot, film it on a smartphone, and then end up with a substandard product. I needed a low-risk project to try out my idea on. Fortunately, the chance presented itself when I got the idea of recording my time at the university’s Graduation Ball: I would record videos of my friends talking about the first time they met me, and what they thought of me. The beauty of this plan was that due to filming times, I would get to put my smartphone camera through its paces in a multitude of lighting conditions and noise environments, and since the video was unscripted, the content of the videos need not be a “good” or a “bad” take. You can watch The First Time…At Grad Ball to see what my effort worked out as. (The second part of my The First Time project was a photo album on Facebook telling the story of the first time I met my friends and what I thought of them.)

I realise that the video itself is pretty much a vanity project. Yet, I felt genuinely happy to do this because for the first time in my life – because of psychological issues I’ve had – I actually feel connected to my friends; that I care about them as individual beings. I wanted to create something to capture the essence of those emotions that I felt. Graduation, even though I’m not graduating yet, is one time people are allowed to be sentimental.

But I digress. The point of this blog post is to document my experience of what I learnt through the process of filming and editing video on a smartphone.

The “Why Now” On Smartphone Filming

Mobile phones have been able to record videos for around six years now, so it’s interesting to note how in general there’s a lot more buzz now about using them in amateur / prosumer contexts. Part of this comes from the fact that while early “smart” mobile phones (think Symbian and their ilk) could record video, the de facto recording format was 3GP / 3G2. Typically recording at QVGA / VGA resolution, the 3GP format allowed compact filesizes necessary for storing video files in a time when phones didn’t have much on-board RAM (to process a video while recording / playback) or storage space (which was often not extensible on such early smartphones).

As you can see in this example Tom and Jerry video, at the amount of compression used on such phones, video quality was poor and often had block distortion artefacts. Furthermore, for storage-saving reasons, the audio format paired with the 3GP container was AMR or low-bitrate AAC, which results in distorted audio – typically recorded in mono, or as doubled stereo from a mono recording.

(Never search “3GP” on YouTube. I did, to find an example, and I got pages after pages of softcore porn. Brrrr.)

Another drawback of early smartphones was that they could not record at 24 fps or above. Phones as recent as my erstwhile 2009-era Nokia 5630 XpressMusic could only record at 15 fps (as demonstrated by this sample video). Such low frame rates added to the jerky, low-quality effect of mobile videos, making them unusable for anything beyond sharing and viewing on other mobile devices.

Things started to change around the launch of the iPhone 3GS, which came with the capability to record video at 30 fps. Android and Symbian handsets launched around the same time could record at similar frame rates, with some able to do it at 720p and others at 480p. The seed for my desire to film a project on a phone was sown by this attempt by Nokia in 2010, who commissioned a short film shot entirely on an N8 (starring Dev Patel of Slumdog Millionaire fame as well as Baywatch's Pamela Anderson).

Fast forward to 2013, when most smartphones can record 720p video at 30 fps, typically in MP4 format. Significant advances in sensor technology, video processing software, and even optical image stabilisation in higher-end phones mean that video shot on phones these days is of much better quality. Certainly workable for amateur video projects. I expect, given the rate of progress in the field these days, that such advancements will continue to trickle down to cheaper devices.

Another important aspect of any video production happens to be sound. It may not be the first thing on your mind, but good audio quality is crucial in video recording. Smartphones are getting better at this too, with multiple microphones, noise cancellation, and high-quality AAC audio recording. You can clearly hear the difference that high quality, distortion-free stereo recording can make to a video in this comparison.

The Gear

Now that I've covered my reasons for why I think present-generation video recording on smartphones is usable enough for video projects, I'll move on to my own experience. My video was shot using my Nokia Lumia 620, so your mileage may vary according to what phone you have.

Outdoor shot with natural lighting, good quality.

Outdoor shots with natural lighting turned out to be good, no problems there. My primary concern while I was filming was how the camera would perform in low light conditions. These turned out to be quite good, surprisingly, for most cases, even though they were lit using the LED flash. This is one instance where the video processing software used by your smartphone manufacturer will likely make a difference. I quite like how results on my Lumia didn't appear to be harshly lit, as is often the case with LED flashes. The photo gallery below has a sample of shots taken with LED flash that turned out fine.

Paradoxically, the video quality was poorer when there was ambient lighting in indoor night-time shots compared to ones where there was no ambient lighting. This, I’m guessing, is partly due to video processing done by the camera itself, and partly due to the lighting conditions. Keeping this factor in mind could be crucial for any indoor night-time shots you plan to shoot.

Video capture still, night-time shot, low quality.

Another important factor to keep in mind is that when recording night-time videos, the LED flash works in “lamp” mode that results in a significantly lower coverage area in terms of illumination when compared to “burst” mode used for taking stills.

Still shot showing larger illumination coverage area with burst mode flash

What this means is that you may need to film closer to the subject when recording a video. Take a look at the photo gallery below for a comparison of illumination when the images used above are thresholded at the same value. (Ideally, I’d have run this comparison across the same scene with different lighting conditions, but this wasn’t a controlled experiment.)

During the filming, the video preview that I saw indicated that the quality of recording was good. Unfortunately, when I had time to play the clips back on my phone / laptop, a major issue was apparent: there was a lot of stutter in the video, often resulting in frozen video / audio, which meant that for many of the recordings I had entire sections of speeches missing! This video sample should demonstrate the problem I'm talking about.

What I discovered when investigating this issue is that it was likely caused by a bottleneck with my micro SD card. I had been using a micro SD card I bought four years ago, rated as a class 4 device. If you aren't aware of this, micro SD cards are rated at different classes based on the read / write speeds they can maintain at a sustained rate. While a class 4 device should technically be able to handle HD video streams, in reality such cards can be much slower.

First, I checked the speed of the phone camera's focussing / metering capabilities using Sofica's CamSpeed benchmarking tool; no surprises to report, as the benchmark showed it was fast enough even in scenes with object motion. I then tested the read / write speed of the micro SD card using the AnTuTu benchmark, and found that the read / write speeds were abysmally low – in the range of 0.9 MB/s – rather than the rated 4 MB/s (even though it was from SanDisk, a brand-name manufacturer). Switching the photo / video storage location from my SD card to the phone's internal memory, and then recording test videos, proved that the video lag problem went away. To further test my theory, I swapped out my original SD card for a new class 10 device I bought, and again, the results proved the same thing: with the faster rated card, there were no video lag problems.
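If you want to sanity-check a card yourself without a benchmark app, a rough-and-ready sequential write test is enough to expose this kind of bottleneck. Here is a minimal sketch; the path and file size are placeholders for whatever card you are testing.

```python
import os
import time

def sequential_write_speed(path, size_mb=64, block_kb=512):
    """Measure rough sustained sequential write speed (in MB/s) to a storage path.

    Video recording is a long sequential write, so this is the figure that
    matters -- a card can carry a "class 4" label and still fall well short
    of 4 MB/s in practice.
    """
    block = os.urandom(block_kb * 1024)
    blocks = (size_mb * 1024) // block_kb
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())   # make sure the data actually hits the card
    elapsed = time.time() - start
    os.remove(path)
    return size_mb / elapsed

# e.g. sequential_write_speed("/mnt/sdcard/speedtest.bin")
```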

This brings up an important issue: don’t skimp on your SD card, get at least a class 10 device! Looking around on Amazon, 4-8 GB class 10 micro SD cards are not that expensive, and the performance premium from the higher class storage is totally warranted for this use case.

(It also brings up an interesting point on whether “Android is slow” can be attributed to apps running off slow SD cards, and Windows Phone’s insistence on not allowing apps to run from SD card. Could be a user experience issue. I’ll follow-up on this in a later blog post.)

The Editing

After I had logged all the videos and was ready for editing, I started by setting up a new project in my video editor of choice, Adobe Premiere Pro CS5.5. This is the step where I ran into my first hitch: most mobile phones these days record video with a variable frame rate (often loosely lumped in with "variable bit rate", VBR), and variable-frame-rate footage is not supported by Premiere Pro at all! The reasons why phones record this way are clear: to save on storage space and, more importantly, to vary the frame rate of capture to compensate for lighting conditions (lower frame rates in poorer lighting, where longer exposures are needed, and higher frame rates otherwise). Premiere Pro supports only a single frame rate across one project, so importing variable-frame-rate videos results in audio drifting wildly out of sync with the video.
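A quick way to confirm whether a clip is the culprit – and to work around it before importing – is to inspect and, if necessary, conform the footage with the ffprobe / ffmpeg command-line tools. This is a workaround sketch of my own, not something Premiere offers; the target frame rate is whatever your project uses.

```python
import subprocess

def probe_frame_rates(path):
    """Report a clip's nominal and average frame rates via ffprobe.

    If r_frame_rate and avg_frame_rate differ noticeably, the clip was most
    likely recorded with a variable frame rate.
    """
    out = subprocess.check_output([
        "ffprobe", "-v", "error", "-select_streams", "v:0",
        "-show_entries", "stream=r_frame_rate,avg_frame_rate",
        "-of", "default=noprint_wrappers=1", path,
    ])
    return out.decode().strip()

def to_constant_frame_rate(src, dst, fps=25):
    """Re-encode a clip at a fixed frame rate, duplicating or dropping frames as needed."""
    subprocess.check_call(["ffmpeg", "-i", src, "-r", str(fps), "-c:a", "copy", dst])
```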

I was quite shocked to find this out, to be honest. The iPhone is the most popular camera in the world – across all device types, including dedicated cameras – and it, too, records variable-frame-rate video, so leaving such a popular device unsupported makes little sense. With professional recording equipment it's easy to lock down frame rates, but it's hard to understand why Adobe wasn't able to devote engineering resources to sort this issue out by 2013. Their decision makes even less sense when you consider that Adobe sells a stripped-down version of their professional video editing software under the Adobe Premiere Elements brand. Surely a large portion of consumers who use Elements would need to edit footage shot on phones?

Sony Vegas Pro was the only software that could process videos from my phone

Ultimately, I solved this problem by downloading a trial version of Sony Vegas Pro – which does support multiple frame rates. The settings UI is slightly clunky, but it does allow setting a frame rate you want to hit across the project, and it accordingly conforms imported videos on its timeline to ensure audio and video stay in sync. As a non-linear video editor, Sony Vegas Pro does have adequate technical capabilities (in terms of format and project-setting support), although the interface does leave a lot to be desired in comparison to Adobe Premiere Pro.

On a more stylistic note, I could have done more work on this video to filter the audio for the noisier recording backgrounds. The noise is audible, but only if you have a good set of speakers. In the end, I left the clips as they were – partly because I didn't want to spend too much time on this project, and partly to impart what I felt was a more "authentic" touch that fully captured the energy of the situations the videos were recorded in.

Naturally, if you're using a Mac, variable frame rates are supported out of the box in iMovie / Final Cut Pro. But if you're editing footage shot on a smartphone on a Windows system, bear in mind that you need to account for additional cost – in terms of acquiring new software, or training time spent getting to grips with software other than what you may be accustomed to. The state of affairs for Windows users in this situation is a tad disappointing.

TL;DR

  • Regardless of how capable your phone’s camera is, it won’t be able to record video “smoothly” unless you use at least a class 10 micro SD card. Invest in this.
  • Adobe Premiere Pro does not support variable-frame-rate video. The only solution that can handle this on Windows that I'm aware of is Sony Vegas Pro.
  • Once you have all the tools necessary for the pre- and post-production phases, smartphone video recording affords a versatility that makes it attractive for filming in uncontrolled settings.

Overall, I would say that I enjoyed the process of making a video on a smartphone. It taught me things to be careful about when using such gear, and hopefully, my lessons help out others thinking of similar projects.


Music discovery is broken, and Spotify Radio is part of the problem

I love discovering new music to listen to, but what I don't like is reading reviews on music blogs or in magazines to find it. Reading through reviews rather than listening to discover new stuff feels so 20th century to me. Music is emotional for me – as it is with many people – and I'd rather get a feel for something new myself to find out whether I like it, rather than just reading words on a page.

(After I find an artist I like, though, I do read up about their history and creative process on music blogs. That’s what they are good for: no matter how obscure an artist, you’re likely to find someone who has interviewed them.)

My go-to source for music recommendations used to be Last.fm. I still scrobble to my profile – partly for keeping track of my music tastes over time, mostly in the vain hope that some day I will be able to unlock this data in a usable form.

Last.fm Spotify app

Last.fm remains a one-of-a-kind service for archiving music tastes; there simply hasn't been any replacement over the years for a service like this. Ever since they stopped on-demand playback of tracks, though (due to licensing issues; it was simply financially unviable for them to offer it), the platform has been all but worthless for new music discovery. Last.fm radio – while good at making automated recommendations based on listening history – is quite cumbersome to use as a standalone application. As it stands now, Last.fm's discovery mechanism is primarily focussed on surfacing recommendations based on similar tracks / albums…which makes the process of going through all of them a chore.

***

I have been a Spotify customer for four years now (first Spotify Unlimited, and now Spotify Premium). It's every music fanatic's wet dream – an almost-limitless library of songs, available on desktop, web, and mobile, with offline sync everywhere; and for the artists holding out from adding their works to its library, it's easy to add local MP3s to your collection.

We Are Hunted's automated playlists

For over a year, I had been using the We Are Hunted Spotify app as the source of my music recommendations. The neat idea behind We Are Hunted was that it scoured the web for signals of what music people were listening to – rather than relying merely on charts such as Billboard – and automatically created digests of the hottest tracks in each genre. I loved it, because for people like me who are into indie / alternative emerging music, it surfaced finds that might not yet be climbing any official charts…but, thanks to Spotify's vast tie-ups with independent labels, could still be found in its library. There were times when I discovered music so underground that it wasn't available anywhere other than Spotify or, perhaps, SoundCloud (not even YouTube!). Many of my "What I Have Been Listening To" recommendations came about this way. Every month, We Are Hunted would publish an automatically-generated playlist of the hottest music and make it available through Spotify. I looked forward to this day every month with the same kind of excitement one would have for a magazine issue.

Twitter Music charts

Then, it all ended when We Are Hunted was acquired by Twitter. Functionally, it does the same thing that it used to – through a far more cumbersome interface. Now, I need to log in to Twitter Music using my Twitter and Spotify accounts, and play the tracks through that interface. There's no easy way to export the list as a playlist any more, as the obvious intention here is to lock people into playing music through Twitter's own interface and / or music apps.

Herein lies the problem with every other music discovery service I have ever seen: they expect the user to play a part in the curation process, either by selecting 'channels' or 'users' to follow (such as Hype Machine, 8tracks, and others), or by basing recommendations on individual albums / tracks / artists / playlists (such as Last.fm and countless Spotify apps). The downside, as I see it, is that either way it requires effort on the part of the user to constantly prune lists of whom to follow, or, in the second scenario, to refresh the recommendations manually. This is a cumbersome process! We Are Hunted was doing something genuinely unique with the service they were offering. It's a service useful enough for me that I'd pay for something like it on top of my Spotify subscription fees. (The closest replacement that I have (recently) stumbled upon is Tunigo. It's still not a perfect replacement for We Are Hunted, though, as its top-of-genre lists are based on absolute numbers rather than 'rising' tracks like WAH used to be.)

***

Spotify's new Discover feature in its web player

The need for such a service or an app would be obviated if Spotify itself offered an equivalent. It has moved somewhat in this direction with the launch of its new Discover feature.

Spotify Radio

Apps such as We Are Hunted and Tunigo are good, but what I really want is a lean-back, interaction-free music recommendation engine – much like Last.fm's radio feature – that simply melts into the background and throws up one new track after another. Spotify does something similar with its Radio feature, on the surface, but in my experience it isn't truly a radio service. Instead of having a native recommendation engine, Spotify integrates with The Echo Nest's music discovery APIs to power its radio functionality – and the way this seems to work is that it generates a static playlist based on a genre, artist, track, or playlist, which is then queued for playback. The reason why I say this is that whenever I play Spotify radio, it seems to keep repeating the same tracks over and over again. Supposedly, using the thumbs up / down buttons is meant to help its recommendation engine 'get better', but all a thumbs down seems to do is skip to the next track, and all a thumbs up does is add the track to a 'Liked from radio' playlist.

Spotify’s API primarily seems to revolve around playlists – and I can see how for people whose music experience revolves around playlists, e.g., many friends I have on Spotify, this is the perfect model of curation for them. However, for anyone – like me – who wants more robust and automated tools for an interaction-free music experience, it falls short of expectations.

My point is that without a good recommendation engine backing up its radio feature, Spotify is leaving money on the table in terms of user experience. It has such a deep library of music, and yet, discovering it is still primarily driven by last-decade methods such as collecting albums and playlists. I expected much better from a service that is supposedly the future of music.

***

UPDATE: Twitter launched a Spotify app that allows playlists to be exported. Solves many of the problems I listed here! I’m glad Twitter realised its #music feature was crippled without supporting an external ecosystem.

UPDATE 2: Twitter Music is dead.


Netflix vs Lovefilm Instant: my impressions on streaming services in the UK

I am a cord cutter. I live in a house of other cord cutters. By that, I mean that we don't own a television in our house, instead opting to watch all our video content on our computers or portable devices. The obvious advantage of this is that none of us need to pay for a TV license. While I haven't been able to find any market research to back this up, anecdotal evidence suggests to me that a sizeable number of students are like me in that they consume a majority of their video content via streaming rather than on television.

The UK is different from the US television market in that a lot of the original TV content produced here is made available online quickly. BBC iPlayer, Channel 4's 4oD, and ITV Player – from the top three British broadcasters – make their content available free of cost (without a subscription or a TV license). The exception is Sky, which uses exclusivity of its content as a unique selling point for its own services.

What remains, then, for the viewing needs of cord cutters in the UK is a) streaming movies, and b) on-demand playback of older content no longer active on the broadcasters' own services. This is the gap that online streaming services such as Netflix or Lovefilm Instant fill. Now, I could easily pirate the content, but these days I try to do the right thing. I buy all my books from the Kindle store rather than pirating PDFs. I subscribe to Spotify's premium service. I genuinely believe that content creators deserve to be compensated for their work. (Reading Paul Carr and Monday Note has convinced me that digital content needs to be paid for to create sustainable businesses which will continue to amuse and entertain us.)

So even with films and television, I want to get my content from legal sources rather than pirating them. There simply isn't any excuse when the price for such services is so affordable: £5-6 a month is something that can comfortably fit into any student's budget. Although I've been a subscriber to Spotify's premium service for years, it's only recently that the streaming video services that have launched in the UK have become mature enough to use.

I only make an exception when there's no legal avenue at all to obtain something I'd happily pay for; that's when I pirate. I totally understand why content owners don't make everything available: at the moment, they don't want to cannibalise their existing business. Even making content available on a paid basis gives people an incentive to cancel their satellite / cable subscription, and that revenue is far too valuable for them right now to risk on online services, which is a more price-sensitive market that won't accept higher prices. (Read this Fast Company piece on the struggle Hulu is facing in the US along these lines.)

What’s good for the customer is that most streaming service companies offer month-long or longer free trials that give you a fair amount of time to test how good their service is. This is exactly what I did. Here are my thoughts on the ones I tried out.

Lovefilm Instant

Lovefilm Instant (£4.99 / month) is Amazon’s play in streaming services in the UK. (In the US, this is branded as Amazon Instant Video.) I tried out a 45-day free trial of Lovefilm from a voucher I got along with an Amazon purchase.

My first impression of the service was that it's a confusing mess. DVD / Blu-ray titles are mixed in with streaming titles. The 'Instant' bit essentially lives on a section of the main Lovefilm site. Discovery is primarily done through 'lists' created within Lovefilm according to genre, and lists created by users such as 'Best of Lovefilm' or 'Staff picks'. This feels odd. I remember a time, many months ago, when Lovefilm used to make some films available for a payment and others included within the subscription package, so the whole 'With Package' section these days – when it no longer offers films for payment – feels like they tried to stuff the current titles into the old interface.

The search function does not have autocomplete. You're flying blind as to what's available and what's not – or if you don't know how to correctly spell a film title or an actor / director name. The hangover of the legacy business of renting DVDs becomes quite apparent whenever you search for a title: the results thrown up show a mix of DVDs as well as streaming titles. If I'm a streaming-only customer, why make things more complicated by showing me results that I cannot possibly access on my subscription plan? Perhaps this is a ploy to upsell you to their costlier plans, but for someone like me who doesn't even have a DVD drive, this is completely pointless. While it's possible to filter the search results according to 'lists' again to show only Lovefilm Instant titles, my point is a user shouldn't have to do this extra step themselves.

Okay, so let's say you don't have a particular film in mind and just want to browse the titles they have according to genre. So I clicked on a list and, to bring some sanity into sifting through the results, chose the option to order results according to 'Member rating'. Here's the problem with that: the 'ordered' results have no fucking relation whatsoever to the member ratings. Note how in the above screenshot the ratings go from 3 stars to 2 to 2.5 to 3 to 4. It simply doesn't do what it says on the tin.

Search / content discovery UX is broken really badly for TV shows. Say I search for 'Lost' (don't judge me): the results are presented as individual episodes. Assuming that the show I wanted was 'Lost On An Island' (whatever, just roll with the example), that would mean that instead of having a list of episodes on a single page for a TV show or season, I would need to click through dozens of pages of search results to find the one I want whenever there are TV shows with common words in their titles. To top this off, Lovefilm offers three criteria to sort results: 'relevance', member rating, and date added. None of these is a particularly effective way of ordering results or browsing a TV show, where the best approach is to provide a sequential list of episodes according to season. Instead (as you can see in the screenshot above), episodes sorted by 'relevance' are ordered completely randomly.

Once you find a film / TV show you want to watch, you click through to its details page, which lists a synopsis along with other related data pulled from IMDb. Clicking 'watch now' (sorry, the screenshot was taken after my subscription ended) starts playing the film…in that tiny embedded player window. It doesn't look visually pleasing, and you'd most definitely need to switch to fullscreen viewing mode. The idea behind it probably is that most people would do that anyway. Even so, at first look, the player interface isn't aesthetically pleasing.

I had trouble with the playback too. Despite having a 100 Mbps broadband connection which never gives me issues, playback kept stalling and giving me 'Content not available offline' error messages randomly throughout my 45-day trial period. The only solution for this seemed to be to exit fullscreen mode, reload the page – at which point the player would prompt me to continue watching from where I had left off – and then start playing back a couple of dozen seconds before the point where playback stalled. I don't know what the cause behind this could be.

What I found most disappointing was that Lovefilm Instant doesn't make the process of figuring out 'what to watch next' easy. This really shouldn't be that hard, since Amazon already owns the best movie database on the planet – IMDb – and if it tried it could easily throw up recommendations based on titles watched or rated previously by a user. Instead, it makes you browse through endless lists of various descriptions, and even then makes usage difficult with non-functional sorting. There's no way of linking your Lovefilm account to your IMDb account (although there is an option to link Amazon and Lovefilm accounts). I cannot fathom how they can let this opportunity pass.

Legacy business hangover rears its head again when you try to browse titles according to ‘related’ content. Browsing to that tab shows titles available on DVD mixed with those available for streaming.

And here comes the subjective part: Lovefilm's streaming library is utter shit. Its library seems to mostly consist of B-grade / C-grade films from the 90s with very few new film releases or TV shows. Perhaps it's just a case of 'watchable' content being hard to discover due to the problems I mentioned above. Lovefilm is clearly trying, because even within the 45-day window I tried the service, I saw new titles getting added. When it comes to TV shows, it hardly has anything that is not already available elsewhere, such as on YouTube or Channel 4's 4oD.

Anyway, when the time came to decide whether I wanted to renew my subscription and become a paying customer, I simply didn't feel the content library was rich enough or the discovery UX intuitive enough to be worth paying for. Obviously, the issues with playback not working properly were a factor. Many of the videos were only available in standard definition, which is baffling in this age when so much content is available in HD. Another factor was that Lovefilm does not (yet) offer any way to stream content to mobile devices; its Android app only allows you to add DVDs to your rental queue.

Lovefilm Instant feels almost like an afterthought to its core DVD rental business. It's a shame that it cannot give good recommendations either, since with Amazon's surprisingly accurate shopping recommendations and IMDb's rating database, it should have enough data to go on that other players don't have.

Netflix

Once my Lovefilm trial expired, I signed up for Netflix (£5.99 / month) – and so far I'm loving it. (I didn't get a free trial because I'd previously signed up for a 30-day trial without using it.) Netflix starts you off with a quick questionnaire on preferred movie genres to personalise recommendations. The contrast with Lovefilm's UI is stark, as Netflix's user interface is inherently more visual. Text is kept to a bare minimum, making the user interface more aesthetically pleasing than Lovefilm's rigidly-structured pages. The whole user experience is centred around serendipity and discovery. Every time you visit Netflix, it presents a different set of thumbnails, making discovering newer titles or titles you may not have heard of incredibly simple.

More information on each title is presented using hovercards. What I like about this is that Netflix uses the initial questionnaire in addition to ratings you make on Netflix as you watch more titles to make a ‘best guess’ for a title’s rating in the eyes of each user. These personalised recommendations help you make snap decisions on whether you want to watch a title or not: if you’re in the mood to watch a film that you would definitely like, you’re likelier to choose titles with higher guessed ratings; or if you’re in the mood to experiment, then you might even consider titles with lower guessed ratings.

One of the things that Netflix has nailed really well is recommendations. In addition to a 'browse' feature that allows you to browse according to category, it also shows recommendations according to categories generated on-the-fly based on your ratings / preferences history. These fluid categories help you quickly discover films similar in tone to ones that you already like. Brilliant!
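For a flavour of what even a basic version of this looks like under the hood, here is a toy item-similarity recommender over a user-ratings matrix. It's a textbook collaborative-filtering sketch of my own, with made-up ratings, and certainly not Netflix's actual system.

```python
import numpy as np

def recommend(ratings, user_index, top_n=3):
    """Toy item-based collaborative filtering over a users x titles ratings matrix.

    ratings[u, t] is user u's rating of title t (0 = unrated). Titles similar
    to ones the user rated highly score higher; already-rated titles are skipped.
    """
    # Cosine similarity between title columns
    norms = np.linalg.norm(ratings, axis=0) + 1e-9
    sim = (ratings.T @ ratings) / np.outer(norms, norms)

    user = ratings[user_index]
    scores = sim @ user                  # weight similarities by the user's own ratings
    scores[user > 0] = -np.inf           # don't recommend what they've already rated
    return np.argsort(scores)[::-1][:top_n]

# Example: 4 users x 5 titles, with hypothetical ratings
r = np.array([[5, 4, 0, 0, 1],
              [4, 5, 0, 1, 0],
              [0, 0, 5, 4, 0],
              [1, 0, 4, 5, 0]], dtype=float)
print(recommend(r, user_index=0))
```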

Netflix also nails 'social' recommendations. You can link your Facebook account to Netflix, and for your friends on the service who have done the same, it can show you films and TV shows that they have seen. The inherent idea behind this, of course, is that you'd be interested in watching films your friends watch, which is not a bad assumption to make because you can always mark any of these 'social' recommendations as 'Not interested'.

And the player UI is beautiful. Each title starts playing in a full browser-window sized player. It also starts playing content automatically and silently upgrades to HD quality once enough data has been buffered. What I particularly like about the player UX with respect to TV shows is that it automatically queues the next episode in the series after you’ve finished watching one. You can either sit back and let it continue, or use the countdown period to browse back to the main library interface. Netflix understands how episodes within a TV show are related by season, and automatic queueing makes for a great ‘leanback’, hands-free experience.

The reason why I singled out the lack of autocomplete as a UX deficiency in Lovefilm is that autocomplete – the way Netflix does it – lets you know right away whether a title you want is available or not. If a title doesn't show up while I'm typing, I know that right away, without having to navigate away to a search results page.

Netflix also offers streaming on tablets and mobile phones through its Android / iOS / Windows Phone apps. The UI is very similar to what's offered on the desktop – so there's no need to figure out anything new. Hovercards are replaced by an 'i' information icon that pops up with the same information. Streaming content – both on desktop and mobile – just works, without any playback hiccups.

There is one thing that I felt Netflix got wrong, though. I signed up for my Netflix account using Facebook Connect, and thus I never needed to set up a password when accessing it on my laptop. When I opened the mobile app, however, I was frustrated to find that there's no way to sign in without providing a password. I had to ask for a password reset on my account and then set up a password, just so that I could use my Netflix account from its mobile app. I don't understand this omission, because there are other mobile apps which happily allow you to sign in using Facebook Connect.

Overall, I’m incredibly happy with Netflix and I think I will stay on as a customer. The content library is rich with a selection of foreign films, new releases, indie films, as well as top-notch American and British TV shows such as Dexter, Breaking Bad, Modern Family, The Inbetweeners, et al. There are sometimes cases where a title I’d find on Lovefilm Instant isn’t available but in general, Netflix UK seems to have a far broader selection of titles than any of its competition.

YouTube Rentals / Google Play

I mention YouTube Rentals / Google Play because Google markets this heavily on the Google Play Store. Unlike Lovefilm or Netflix, Google does not offer a fixed subscription that allows you access to a library. You need to buy each film separately, and then you have 48 hours to watch it. Playback quality is good, as you'd expect from any standard HD video on YouTube. Content selection seems to be about the same as Netflix (sometimes better), minus the TV shows. It's just not for me, though, because buying access to each title at £2.49-3.49 a pop is way too expensive. Prices aren't too out of line compared to Apple iTunes offerings, though. I mention this only for the sake of completeness – I'd take an all-you-can-eat subscription any day over a piecemeal model.

Conclusion

Streaming services have been late to the party in the UK compared to the US, but if you are thinking of signing up for one, now is a better time than ever, with offerings a lot more mature than they were when the services debuted. Subscription package costs are quite reasonable too for the content offered. I'd recommend that you sign up for trials on each service and see which one gives the best fit for your viewing tastes: you might find that either Lovefilm or Netflix is a better fit for what you like to watch.

I also haven’t tried NowTV – Sky’s new online streaming service. It’s significantly more expensive at £15 / month, but it also has a much larger library due to Sky’s weight in the general TV market. I may give it a go after a couple of months in case I find myself running out of stuff to watch on Netflix.

A cautionary note for Linux users: none of these services will work for you. Lovefilm and Netflix both require the Silverlight player to enforce DRM restrictions; and YouTube for some reason doesn't allow playback of film rentals on Linux either (probably due to DRM reasons again), even though its standard player is Flash-based.

I think the big gap at the moment is in how broad and deep TV show libraries are on streaming services. Hulu still hasn't launched in the UK, despite noises being made about it since 2009. What I found across all three streaming services here is that the TV content is either something already available for free from the channel on which it airs or, when it is available, restricted in the number of seasons offered. That, however, is a broader industry issue – and I hope TV networks catch on to the fact that if they don't make content available legally, people will simply get it when they want through illegal means. Movies have the same 'release window' problem, but it's much more acute with TV shows, because streaming services usually catch up many seasons later, whereas with many films I'm happy to wait until they become available. (The films I really want to watch, I watch in the cinema.)

I think what I like best about streaming video – Netflix in particular – is that it makes finding new stuff to watch so effortless. I don’t need to go hunting for links on illegal streaming sites or worry about what quality the video will be. I like the simplicity and I want content creators to be compensated for their work. I hope that this march towards a future of on-demand content does not get bogged down with exclusive deals which effectively silo different content across multiple services.

N.B. I realise that it's hard to objectively call Lovefilm's UX bad without further data. Craigslist, for instance, stands out as an example of website design that doesn't turn heads but clearly works for them. The only measure that can really speak definitively is data from split testing of design / functionality, or revenues. Here again, the problem is that Lovefilm may have higher revenue due to its volumes in the DVD retail business, so any comparison would need to be done on revenue purely for its streaming business versus Netflix's. And that data is not easily available. Perhaps Lovefilm has found that their design does work for them, due to familiarity in the eyes of its users versus Netflix users, who may be 'savvier' (again, a question that cannot be answered without knowing demographics). My intention in writing this blog post was to present what I felt about the design and service of the two contenders – in the hope that some people find it useful.