
Amazon’s Fire Phone is incredibly smart…and what it means for the future of smartphones

By on Aug 25, 2014 in Tech Takes | 2 comments

The announcement of the Amazon Fire Phone is one of the most interesting pieces of technology news I’ve come across in recent times. While the jury is out on whether it will be a commercial success or not (UPDATE: recent estimates suggest sales could be as low as 35,000), the features that the phone comes with got me thinking about the technical advancements that have made it possible. The caveat here is that much of what follows is speculation – but I do have a background in research projects on speech recognition and on computer-vision-related user experience. I’m going to dive into why the Fire Phone’s features are an exciting advance in computing, what they mean for the future of phones in terms of end-user experience, and a killer feature I think many other pundits are missing.

Fire Phone’s 3D User Interface

I did my final year research project on using eye tracking on mobile user interfaces as a method of user research. The problem with many current methods of eye tracking is that they require specialised hardware – typically the approach is to use a camera that can “see” in infrared, illuminate the user’s eye using infrared, and use the glint from the eye to track the position of the eyeball relative to the infrared light sources.

This works fabulously when the system is desktop-based. Chances are, the user is going to be within a certain range of distance from the screen, and facing it at a right angle. Since the infrared light sources are typically attached to corners of the screen – or at an otherwise-known fixed distance – it’s relatively trivial to figure out the angles at which a glint is being picked up. Indeed, if you dive into research on this particular challenge in computer vision, you’ll mostly find variations of approaches on how to best use cameras in conjunction with infrared. The drawback to this approach is that the complexity involved vastly increases when it comes to mobile platforms.
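To make the desktop case concrete, here is a rough Python sketch of the glint idea. It rests on assumptions of mine rather than on any particular tracker’s method: four infrared sources sit at the screen corners, their glints form a roughly axis-aligned rectangle in the camera image, and the pupil centre’s position inside that rectangle maps linearly onto the screen. The names `Point` and `estimate_gaze` are made up for illustration, and a real system would also need per-user calibration and correction for head pose.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Point:
    x: float
    y: float


def estimate_gaze(pupil: Point, glints: Sequence[Point],
                  screen_w: float, screen_h: float) -> Point:
    """Crude gaze estimate (illustrative only).

    Assumes the glints are reflections of IR sources at the screen
    corners and that they form a roughly axis-aligned rectangle in the
    camera image. The pupil centre's relative position inside that
    rectangle is scaled up to screen coordinates.
    """
    if len(glints) != 4:
        raise ValueError("expected exactly four glints")

    xs = [g.x for g in glints]
    ys = [g.y for g in glints]
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)
    if right == left or bottom == top:
        raise ValueError("glints are degenerate (zero-area rectangle)")

    # Normalised position of the pupil inside the glint rectangle.
    u = (pupil.x - left) / (right - left)
    v = (pupil.y - top) / (bottom - top)

    # Clamp to the screen: a pupil outside the rectangle means the
    # user is looking past the edge of the display.
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)

    return Point(u * screen_w, v * screen_h)


if __name__ == "__main__":
    glints = [Point(100, 100), Point(120, 100),
              Point(100, 112), Point(120, 112)]
    print(estimate_gaze(Point(105, 103), glints, 1920, 1080))
```

In this toy setup a pupil centre a quarter of the way into the glint rectangle lands a quarter of the way across a 1920×1080 screen. The fixed-rectangle assumption is exactly what stops holding once the screen itself can move and tilt in the user’s hand.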
To figure out the angle at which a glint is being received, it’s necessary to figure out the orientation of the phone from its accelerometer (which senses the direction of gravity, and hence the phone’s current tilt) and its gyroscope (how quickly the pose of the phone is changing in the world). In addition to this, the user themselves might be facing the phone at an angle rather than facing it at a right angle, which adds another level of complexity in estimating pose. (The reason this is needed is to estimate visual angles.) My research project’s approach was to use techniques similar to those of a desktop-based eye tracking program called Opengazer, coupled with pose estimation on mobiles, to track eye gaze.

Actually, before the Amazon Fire Phone there was another phone touted as having “eye tracking” (according to the NYT): the Samsung Galaxy S IV. I don’t actually have a Samsung Galaxy to play with – nor did the patent mentioned in the New York Times article linked above show any valid results – so I’m basing my guesses on demo videos. Using current computer vision software, given the proper lighting conditions, it’s easy to figure out whether the “pose” of a user’s head has changed: instead of a big, clean circular eyeball, you can figure out there’s an oblong eyeball instead, which suggests the user has tilted their head up or down. (The “tilt device” option for Samsung’s Eye Scroll, on the other hand, isn’t eye tracking at all, as it’s just using the accelerometer / gyroscope to figure out that the device is being tilted.) What I don’t think the Samsung Galaxy S IV can do with any accuracy is pinpoint where on the screen a user is looking, beyond “it’s changed from a face at a right angle to something else”.

What makes the Fire Phone’s “3D capabilities” impressive?

Watch the demo video above of Jeff Bezos showing off the Fire Phone’s 3D capabilities.
As you can see, it goes beyond the current state of the art that the Galaxy S IV has – in the sense that to accurately follow and tilt the perspective based on a user’s gaze, the eye tracking has to be incredibly accurate. Specifically, instead of merely basing motion on how the device is tilted or how the user moves their head from a right angle perspective, it needs to combine device tilt pose, head tilt / pose, as well as computer vision pattern recognition to figure out the visual angles the user is looking at an object from.

Here’s where Amazon has another trick up its sleeve. Remember how I mentioned that glints off infrared light sources can be used to track eye position? Turns out that the Fire Phone uses precisely that setup – it has four front cameras, each with its own individual infrared light source to accurately estimate pose along all three axes. (And in terms of previous research, most desktop-based eye tracking systems that are considered accurate also use at least three fixed infrared light sources.)

So to recap, here’s my best guess on how Amazon is doing its 3D shebang:

Four individual cameras, each with its own infrared light source.
Four individual image streams that need to be combined to form a 3D perspective…
…combined with the device’s orientation in the real world, based on its accelerometer…
…and how quickly that orientation is changing, based on its gyroscope

Just dealing with one image stream alone, on a mobile device, is a computationally...

July 2011

By on Jul 31, 2011 in Personal | 2 comments

I’ve been away from my blog for such a long time that it’s easiest to get it over with in list style:

I turned 21! This was my first birthday spent away from family, the first birthday away from school friends – and I was a touch saddened by that. Then again, my awesomesauce friends in Singapore made sure it was a memorable day. Do I consider this a milestone (kilometrestone doesn’t have the same ring to it)? Definitely, especially because I spent the better part of a year in Singapore.

I spent the past two months working on a research project at NTU. I was under the Division of Information Engineering, in a team working on a next-generation touch computing interface called STATINA. My task was one of the branch-offs associated with the touch computing project: to make a continuous speech recognition engine that could work with Asian accents. My project was based on the ubiquitous Cambridge University Engineering Department software toolkit HTK, using data recorded at NTU. This was fun, as speech recognition has been one of the areas that has drawn me in over the past months, and I got something meaty to chew on while contributing to an existing research project. I was glad to have a supportive professor and PhD mentor to give me guidance throughout the research project. If I had to single out one thing, I think my main contribution would have been using my readings on linguistics to approach the problem from not just a technical standpoint. The research project was under the Summer Research Internship Programme (SRI) run by NTU and sponsored by the Singapore government (I think, at some level). I highly recommend it to everyone for the exposure it gives you to ‘real’ research. Don’t expect to change the world in the eight weeks or so that you get; this is more like a taster. It pays well too – about S$3000 for two months – and you get to experience the culture of an alien country. 
It’s just incredible to meet 50-odd people from around the world and go through this journey of discovering Singapore all over again through the social events organised – we had regular parties and events to bond over in the weeks here. If this video doesn’t convince you, I don’t know what will!

I will be leaving Singapore for good – at least for the foreseeable future – on the 4th of August. I’ve lived there just a week or two short of a full calendar year, and all the events of the past year make this year stand out prominently in my life so far. I loved and lost (long distance really doesn’t work out, so it’s better to live in the moment) and loved again and lost and then some more. The past 2-3 weeks have been pretty eventful in more ways than one (and not just because of my birthday), including some wicked parties (71st floor of Swissotel – on a helipad!). I have also been to more traditional, ‘heartland’ Singapore and partaken in activities and food that make Singaporean citizens cry tears of joy.

I dyed my hair blue-black again with less than spectacular results. Not impressed. Semi-permanent dyes seem to give a stronger effect but don’t last as long; when they start fading they look hideous. Permanent dyes stick longer but getting the shade just right is hard. Still, almost-there blue-black is better than being a ginger as I once was. Despite the punishing my hair took when I bleached it, I think my hair’s in better health now than it was a year ago.

Carry rubber with you at all times. Like, seriously.

I found a year-long undergraduate placement in the UK! This had been a huge challenge, as only a handful of companies ever agreed to interview me over the phone or video conferencing. 
I was also actively exploring the option of working in Singapore (most actively pursued, although visa issues were a major hiccup; furthermore, tech companies mostly have a business / sales presence here rather than a technical one), Malaysia (Penang is a hotbed of electronics manufacturing), Hong Kong (opportunities were mostly in business / finance) and Taiwan (d’oh, the electronics industry here is HUGE!). I’m glad to have found a company that I really like though, which I will be joining in mid-August. I won’t be returning to my university as this job is based in Fareham, at a company that deals in IC design software and fabrication. And while I’m sure there’s something to learn from every industrial placement / internship – this is one of the reasons why I opted for a ‘sandwich year’ in the first place – I’m so happy to have found a company that offers me a blend of electronics and software work to sink my teeth into. I still need to find accommodation and that’s probably going to eat up my time in the first few weeks back in the UK. Fareham is kinda located midway between the cities of Southampton and Portsmouth, and I for one wouldn’t mind living in the lovely coastal city of Portsmouth.

Speaking of Taiwan, I’m currently in Taipei City and will be here for about six days. I have friends studying / working here whom I met on my summer internship as well as friends from Surrey University. And the nice-ass (literally, for...

“Wo de ming zi jiao zuo ‘Bao Zhi Sen’”

By on Sep 14, 2010 in On A Whim | 0 comments

The title of this blog post roughly translates to “My name is Ankur Banerjee”, where ‘Bao Zhi Sen’ is apparently my name in Mandarin. The surname comes first, where ‘Bao’ is ‘Banerjee’ translated phonetically, and ‘Zhi Sen’ is the literal translation of ‘Ankur’. I have no idea whether ‘Bao Zhi Sen’ means ‘flying monkey bollocks’ in reality, so I’ll have to trust the person who told me this. ;) I went to a Mandarin Chinese speaking session today. In case you didn’t know, the two main languages under the broad umbrella of what is called ‘Chinese’ are Mandarin and Cantonese – within which there’s a complicated spaghetti of other variants, dialects, grammar styles and whatnot. Mandarin is the most widely spoken form. It was an informal session where those who didn’t know Chinese were taught by those who did. Not all the volunteers were of Chinese origin – one of the teachers was a US Air Force Academy student here on study exchange who’d lived for a few years on a US base in Taiwan! We were taught a bunch of handy phrases, but to be honest I (and most of us there) didn’t get the pronunciations right for anything except the Mandarin numerals and basic greetings. Ni hao (hello) is not the only word I know now. 42 in Mandarin is 四十二 (si shi er). I’m definitely getting this tattooed on my right forearm. PS – I forgot to add this video of Sheldon Cooper’s attempts at learning...

And Another…review

By on Dec 29, 2009 in Reviews | 0 comments

My rating of And Another Thing… by Eoin Colfer: 6 / 10
Publisher: Penguin Books

’Twas inevitable. I should never have got my hopes up. I know I said I’d be attending Hitchcon 09, but when I finally landed here I also realized that if it took me two hours to go to some place practically next door right here in Guildford, then there was no chance in hell of me finding the venue for Hitchcon in a big city like London. That, and none of the bastards from the Sci-Fi Society at University of Surrey wanted to go there or had even heard of Hitchcon. Utter bastards. Vogons. May they have poetry read to them. I did wake up early in the morning on the day of the book release and went to the local Waterstone’s store – at 8am. This was a bit of an overreaction since the shop was scheduled to open at 10am, and to be frank there wasn’t any huge line outside it. Still, as a fanboi I expected Douglas Adams would want at least this much as a sacrifice – if not going to Hitchcon. I got my copy that very day, from WH Smith (Waterstone’s was slightly more expensive). That was way back in October. You’d have expected me to give a review of the book soon after buying it. I’d expect that too. Curiously however, I didn’t finish reading the book until yesterday. I’ve been trying to brush this off saying “I’ve been too busy”, but now I realize the real reason – I’ve been too scared so far to read it, in case the book wasn’t a worthy successor to the legacy Douglas Adams left behind. Eventually I decided enough was enough and that I should get it over with.

Eoin Colfer, with people who turned up for Hitchcon

This authorized sequel to Douglas Adams’s Hitchhiker’s series is written by Eoin (pronounced ‘Owen’) Colfer, best known for his Artemis Fowl books. Damn, this introduction must have been used SO many times by so many people when describing this book. 
And therein lies the crux of the matter – whenever a book (or movie) has to resort to saying “…also written by this author”, it usually translates to ‘recipe for disaster’. This is despite the fact that I’ve heard a lot of praise for the Artemis books (I’ve never read them).

Image by visitmanchester via Flickr

And Another Thing… is not a bad book per se. Colfer puts in his best effort, but I agree with this review on io9 that it seems he’s “trying much too hard and also not quite trying hard enough”. When I look back at the time I spent yesterday reading the book, there was only one instance when I laughed out loud (“Focus, President Steatopygic. Focus.”). One. I did force a chuckle now and then but then that’s precisely how all the jokes feel – forced. Most of the jokes are done via the medium of half-hearted footnote-style ‘Guide notes’, about what the Guide would have to say on certain topics. What happens is that very quickly, this style of joke becomes monotonous. “It’s funny”, you realize in an almost clinical way, but then it doesn’t surprise you like the real Douglas Adams did. As I mentioned earlier, reading Douglas Adams is a bit like falling in love. Colfer plays it safe throughout the book. He doesn’t introduce any major new characters, concepts or locations, drawing instead on the various colourful people and locales DNA cooked up. In this regard, Eoin borrows from the radio series at places in the novel too. Whatever new bits he’s introduced are the bare minimum required to keep the story ticking. I can guess that this was done so as not to tick off ardent fans of DNA, but then for a lot of us that is what defined Adams’s work – quirky, unusual characters and each page brimming with caustic wit. On a more general note, the humour doesn’t always work because it’s ‘typically British’. 
I’ll speak more about this in the future in an epic blog post I’m penning, tentatively titled Surely You Jest, Good Sir, to discuss British catchphrases and the British sense of humour. There are passages in the book which seem as if it’s Adams’s work channelled via Colfer as the medium, but those are few and far between. I also expected some sort of fusion between Hitchhiker’s and Dirk Gently storylines, given the amount of focus Thor has received in this book – that certainly would have been a bold move to make – but that was not to be. I won’t say And Another Thing… is a disaster, but it serves as a reminder that nobody can step into the shoes of the literary genius that was Douglas Noel Adams. Read it for the sake of completing the series…but it will probably leave you with a feeling of emptiness. PS – I miss Marvin! I miss Zem, the mattress! PPS – “Resistance is useless!”

To entertain, amuse and inform

By on Nov 10, 2009 in Personal | 2 comments

Haven’t blogged for quite a while because…phew…I’ve been involved with so much stuff! So here’s a quick jog through them. There are obviously university lectures which I *cough* always attend *cough*. We’re rapidly approaching the stage where we are going to start doing stuff I haven’t already covered in India, in some of the courses. The next few weeks leading up to Christmas promise to become slightly more intensive as far as academics are concerned. One subject that is really challenging for a lot of us (including me) who have never done electronics practicals in school is the electronics labs; they’ve been designed around the premise that the students are already competent in electronics to UK A-Level standard. It can be quite a steep learning curve when they expect you to run before you walk. However, our sweet professors are quite receptive to feedback and things are already in motion to change this next year so that a basic introduction is done for everyone. Thank this batch, future students. Also on a slightly academic note are my Spanish language classes. This is an optional add-on that you can take on, and the language course credits are added as extra credits to your degree. Class strength has dropped by half compared to the first week. Our Spanish professor claims that he was either Fernando Alonso’s teacher or butler – my Spanish isn’t nearly good enough for me to comprehend which of the two it is. I get along fine as far as writing / reading is concerned, but speaking / listening can really puzzle a lot of us in class. Hopefully, the full week break next week (oh yeah, I’ve got a mid-sem holiday!) will help me catch up. Moving on to other stuff. I got a job on campus with the University’s CoLab team. Basically, it’s about conducting workshops on various fields related to personal development, creative approaches to education, et al. They pay more than twice the minimum wage, so it’s really sweet! :) No wonder that 42 (I kid you not) people applied for 4 vacancies. 
The Stag (Issue 11)

I’m the copy editor at the student union newspaper The Stag. It’s a fortnightly publication (and soon to be available online) covering events on campus and around Guildford, along with a few special features. Boy, the amount of time that went into the final design stage of the latest issue was tiring…but in the end we had this academic year’s first issue. Will work on getting basic stuff online after coming back from the supermarket today, but The Stag team is working on a proper online edition. Read the latest issue online above (my bio is on page 4). Without a doubt though, the thing I’ve gotten most involved with right now is MAD TV. That’s a student TV society which currently publishes student news programmes online (but we plan to expand to diverse shows next year). Made some amazing friends, both at MAD TV and The Stag. I’m involved with the technical side of things – filming, editing, sound. Again, it’s run completely by students – we’re a committed group of around 20 – and it’s been quite a ride. Shoots can be quite exhausting, with those involved all but ready to give up by the end of the day. MAD TV has really taken off this year and we’ve been learning right from day 1. We’re constantly working on improving everything from the stories to the technical aspects of the shows we release, so watch out in the coming weeks!

Watch Episode 4 of MAD TV (Part of technical team for ‘Change on Campus’ story)
Watch Episode 5 of MAD TV (I worked on filming / editing the ‘Surrey Sports Park’ story)

As I mentioned, there’s a mid-sem break coming up next week. Will use that to catch up on work. Might also use it to push out a few blog posts which have been in the works for a while. Learn a bit of proper cooking. Upload more pictures, including ones from the national UKSEDS 2009 space conference that our uni hosted this weekend. Catch up with friends back in India. Complete the assignment we have to submit! 
Continue balancing everything I’m involved with at uni. :) There are a few words which I tend to repeat a lot, don’t...