In a new research paper detailing how it trains the AI models behind the iPhone and its other products, tech giant Apple appears to have relied on Google's chips rather than those of market leader NVIDIA. In the paper, Apple shares that its 2.73 billion parameter Apple Foundation Model (AFM) was trained on v4 and v5p tensor processing unit (TPU) cloud clusters provided by Alphabet Inc.'s Google.
Apple reveals that it trained an AI model "from scratch" on 6.3 trillion tokens using "8192 TPUv4 chips". For on-device AI models, which handle functions such as typing and image selection, Apple uses a 6.4 billion parameter model trained on 2048 TPU v5p chips.
Other details in the paper include evaluations of the models for harmful responses, handling of sensitive topics, factual accuracy, mathematical performance, and human satisfaction with their output. According to Apple, the AFM server and on-device models lead the industry in suppressing malicious outputs.
For example, the AFM server had a malicious output violation rate of 6.3%, compared with 28.8% for OpenAI's GPT-4. Similarly, the on-device AFM had a violation rate of 7.5%, compared with Llama-3-8B's rate of 21.8%.
For summarizing emails, messages, and notifications, the on-device AFM achieved satisfaction rates of 71.3%, 63%, and 74.9%, respectively. The research paper states that these scores outperformed the Llama, Gemma, and Phi-3 models.