Big tech companies like Google and Facebook have the AI advantage of size. The massive archives of data these companies have collected are said to give them an equally big competitive advantage when it comes to training AI algorithms. Typically, the more data available, the better an algorithm can be trained and the better it ultimately performs.
To reduce their dependence on massive collected datasets, some data scientists are testing ways to generate training data synthetically.
Researchers in Japan have used fractals to synthetically generate images for training AI algorithms, and the initial results have been promising. The resulting trained models performed almost as well as models trained on very large, carefully curated image datasets such as ImageNet and Places.
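To make the idea concrete, here is a minimal sketch of how fractal images can be generated procedurally with an iterated function system (IFS) rendered via the "chaos game". This is an illustrative toy, not the researchers' actual pipeline: the parameter ranges, image size, and normalization step are all assumptions chosen for simplicity.

```python
import numpy as np

def random_ifs(n_transforms=4, rng=None):
    """Sample a random IFS: a set of affine maps x -> A @ x + b.
    Parameter ranges are illustrative, not taken from any paper."""
    rng = np.random.default_rng(rng)
    A = rng.uniform(-0.8, 0.8, size=(n_transforms, 2, 2))
    b = rng.uniform(-1.0, 1.0, size=(n_transforms, 2))
    # Rescale each map so it is a contraction; this keeps the
    # iterated points bounded so the attractor is well defined.
    for k in range(n_transforms):
        norm = np.linalg.norm(A[k])
        if norm > 0.9:
            A[k] *= 0.9 / norm
    return A, b

def render_fractal(A, b, size=64, n_points=20000, rng=None):
    """Render the IFS attractor by repeatedly applying a randomly
    chosen affine map to a point and plotting the visited locations."""
    rng = np.random.default_rng(rng)
    pts = np.empty((n_points, 2))
    x = np.zeros(2)
    for i in range(n_points):
        k = rng.integers(len(A))
        x = A[k] @ x + b[k]
        pts[i] = x
    pts = pts[100:]  # discard burn-in before the point settles on the attractor
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    span = np.maximum(hi - lo, 1e-9)
    ij = ((pts - lo) / span * (size - 1)).astype(int)
    img = np.zeros((size, size), dtype=np.uint8)
    img[ij[:, 1], ij[:, 0]] = 255  # mark visited pixels
    return img

A, b = random_ifs(rng=0)
img = render_fractal(A, b, rng=0)
print(img.shape)
```

Because every image is produced from a handful of random parameters, a pipeline like this can emit unlimited labeled examples (e.g., labeling each image by the IFS that generated it) without collecting or annotating any real-world data.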
James Clark, entrepreneur and computer scientist, said that “if researchers can build simulators to generate arbitrary amounts of data, they might be able to further change the cost curve of data generation. This might have weird economic and strategic implications: if you can simulate your data using a computer program, then you can change the ratio of real versus simulated/augmented data you need. This has the potential to both speed up AI development and also increase the inherent value of computers as primary AI infrastructure – not only can we use these devices to train and develop algorithms, but we can use them to generate the input ‘fuel’ for some of the more interesting capabilities.”