In 2012, Andrej Karpathy wrote a reflection called The state of Computer Vision and AI, in which he expresses his frustration with the level of maturity of the systems of that time.
If you are not familiar with his name, he is a computer scientist and the current director of Artificial Intelligence at Tesla, and I am a huge fan of his. He runs a popular blog, and his most recent post, which also helped me get into #blockchain, is called A from-scratch tour of Bitcoin in Python. I 100% recommend it if you are interested in the topic.
The first thing I thought when reading the computer vision post was: WOW. 2012. A professional in the field of computer vision was expressing sadness about the pace at which the field was advancing. At that time I was in a completely different reality, starting the second year of my bachelor's degree, and I wouldn't have imagined that by 2021 I would feel the same way he did. To be honest, I feel like this in general about the field I work in, which is more about the Data Science side of AI. But the fact that, in Europe, neither I nor my colleagues from university have been able to have a proper computer vision career says a lot about the current state of this field in industry.
And so here we are in 2021, almost 10 years later. I still agree with his thoughts. He explains perfectly why Computer Vision will definitely remain an 'unexplored' field for a long time.
It is mind-blowing. Even though AI has advanced quite a lot over the past years, the essence that we capture in a scene, or that characterizes someone's voice, is basically constructed from each individual's interaction with the external world, converted into experiences. Storing that in a format a machine could learn from is so hard that it is quite crazy to even try to imagine it.
For me this is extremely exciting, and it is one of the reasons I invite everyone to join this field: every small step we take and every bit of knowledge we gain improves the field and becomes the new starting point!
Why my picture? I can easily trick a system such as YOLOv3 into producing a label I identify with... However, can you imagine everything that was actually happening in that moment? Or everything that happened before and led to that moment?
What are your thoughts on this? Leave me a comment below 😀