This New AI Could Be the First Real ARTIFICIAL BRAIN!
AI Summary
Summary of TopoLM (Topographic Language Model)
Overview: A new AI model, TopoLM (Topographic Language Model), developed at EPFL's NeuroAI laboratory in Switzerland, mimics how the human brain organizes language processing.
Key Contributors: Assistant Professor Martin Schrimpf and his team.
Background: fMRI scans show clusters of brain tissue that activate for specific language tasks (verbs, nouns, syntax). Earlier models succeeded on visual tasks by mimicking this kind of clustering among their artificial neurons.
Model Architecture:
- Based on a modified GPT-2 structure with 12 transformer blocks, 16 attention heads, 784 units per layer arranged in a 28x28 grid.
- Introduces a novel training objective: Spatial Smoothness Loss. The model is penalized for differences in activation of nearby units, thus enhancing spatial consistency among neurons.
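The grid layout and the smoothness penalty described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the neighborhood radius, and the use of a mean squared difference are all hypothetical choices that follow the wording of this summary, and the published loss may be formulated differently.

```python
import math
from itertools import combinations

GRID_SIDE = 28  # 28 x 28 = 784 units per layer, as stated in the summary


def grid_positions(side):
    """Give each unit index a (row, col) coordinate on a side x side grid."""
    return [(i // side, i % side) for i in range(side * side)]


def spatial_smoothness_loss(activations, positions, radius=1.5):
    """Mean squared activation difference over pairs of nearby units.

    ASSUMPTION: "nearby" means within `radius` grid steps (Euclidean) and the
    penalty is a squared difference; both are illustrative choices only.
    """
    total, count = 0.0, 0
    for (i, p), (j, q) in combinations(enumerate(positions), 2):
        if math.dist(p, q) <= radius:
            total += (activations[i] - activations[j]) ** 2
            count += 1
    return total / count if count else 0.0


# Small 4 x 4 demo: a smooth left-to-right gradient versus a checkerboard.
demo_positions = grid_positions(4)
smooth = [col / 3 for _, col in demo_positions]
checkerboard = [float((row + col) % 2) for row, col in demo_positions]
smooth_loss = spatial_smoothness_loss(smooth, demo_positions)
checker_loss = spatial_smoothness_loss(checkerboard, demo_positions)
```

In this sketch the smooth gradient receives a much lower penalty than the checkerboard, which is the behavior the summary attributes to the loss: neighboring units are pushed toward similar activations.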
Training Details: Trained on 10 billion tokens using four NVIDIA A100 GPUs over five days. Training stops when the validation loss ceases to improve.
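The stopping rule mentioned here is commonly implemented as early stopping with a patience counter. The sketch below assumes a hypothetical list of validation losses and a hypothetical patience of three evaluations; the summary does not say which values the team actually used.

```python
def early_stopping_point(val_losses, patience=3, min_delta=0.0):
    """Return (stop_index, best_loss) for a sequence of validation losses.

    Stops at the first evaluation where the loss has failed to improve on the
    best value by more than `min_delta` for `patience` evaluations in a row.
    ASSUMPTION: `patience` and `min_delta` are illustrative defaults.
    """
    best = float("inf")
    stale = 0
    for step, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best, stale = loss, 0  # improvement: reset the counter
        else:
            stale += 1
            if stale >= patience:
                return step, best
    return len(val_losses) - 1, best  # never triggered: ran to the end


# Hypothetical validation curve that plateaus around 3.1.
losses = [3.5, 3.2, 3.1, 3.1, 3.12, 3.11, 3.0]
stop_step, best_loss = early_stopping_point(losses, patience=3)
```

With these made-up numbers the run stops at the sixth evaluation (index 5) and never sees the later drop to 3.0. That is the usual trade-off of a short patience window.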
Performance Metrics:
- Achieved a cross-entropy of 3.075 with a spatial loss of 0.108.
- Comparative analysis shows lower performance on grammar checks but improved performance on other NLP tasks.
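For readers more used to perplexity, the cross-entropy figure can be converted with a one-line formula. The sketch assumes the value 3.075 is reported in nats per token, which the summary does not state; if it were in bits, the base would be 2 instead of e.

```python
import math


def perplexity_from_cross_entropy(cross_entropy_nats):
    """Perplexity is the exponential of the per-token cross-entropy in nats."""
    return math.exp(cross_entropy_nats)


# ASSUMPTION: the reported 3.075 is a natural-log cross-entropy per token.
reported_cross_entropy = 3.075
ppl = perplexity_from_cross_entropy(reported_cross_entropy)
```

Under that assumption the perplexity comes out at roughly 21.6, meaning the model is on average about as uncertain as if it were choosing uniformly among some 22 tokens at each step.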
Comparative Advantages:
- Exhibits clustering of language units akin to human brain activity. Its selectivity for verbs versus nouns closely resembles that seen in the brain (the correlation rises from 0.48 to 0.81 when nearby units are taken into account).
- Retains substantial utility in common NLP applications despite slight grammar trade-offs.
Applications and Future Prospects:
- Potential for more interpretable models and for neuromorphic chips.
- Possible insights for clinical neuroscience, such as targeting stimulation by cortical coordinates to aid recovery from language deficits.
Conclusion: TopoLM successfully integrates spatial organization principles from neuroscience into a language model, indicating a promising synergy between AI and cognitive neuroscience. Further validation will come from ongoing brain scans aimed at identifying new language-processing clusters.