Apache Arrow: Solving AI-Hardware Impedance Mismatch
Object-oriented databases were devised as a solution to a problem frequently called ‘impedance mismatch’: the heavy overhead of constantly reformatting object data as it moves back and forth between object-based programming languages like Java and C++ and the structured table format of a relational database.
AI now has its own kind of ‘impedance mismatch’ headache, this time between AI algorithms and the hardware they run on.
Wes McKinney, author of the Pandas Python library, told EETimes that “the data science tools themselves in languages like Python and R lagged significantly behind advances in computing hardware. Most such tools are not designed to run on multi-cores, GPUs, or systems with lots of RAM because they were designed a decade ago when you didn’t have things like a CPU with 16 cores. Similarly, a massive acceleration of memory has occurred. Disk drives are getting a lot faster. You also have solid-state drives. It’s not just about memory speeds, but your ability to get access to data is getting faster.”
McKinney has joined the Apache Arrow project, an open-source, programming-language-independent tool intended to make AI software, data, and AI hardware work together more smoothly. Arrow supports modern CPUs and GPUs and provides fast data transfer and access. It supports efficient in-memory computing, serialization of data into a columnar format, and efficient data exchange.
