Data generated by an algorithm, simulation, or model (possibly an AI system) rather than collected from real-world sources. Synthetic data can reduce privacy risks and augment limited training sets, but it may misrepresent real-world distributions and miss rare edge cases.
See: Data augmentation; Model collapse; Privacy; Training data
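
Example: a minimal sketch of how synthetic data can miss edge cases, assuming a deliberately simple generator. The "real" data here is itself simulated (a skewed log-normal sample) purely for illustration, and the helper names `fit_and_sample` and `tail_fraction` are hypothetical. The generator fits a Gaussian to the real sample and draws from it, so the synthetic mean matches while the rare large values are underrepresented.

```python
import random
import statistics


def fit_and_sample(real, n, seed=0):
    """Fit a Gaussian to `real` and draw n synthetic values from it."""
    rng = random.Random(seed)
    mu = statistics.fmean(real)
    sigma = statistics.stdev(real)
    return [rng.gauss(mu, sigma) for _ in range(n)]


def tail_fraction(values, threshold):
    """Fraction of values strictly above `threshold`."""
    return sum(v > threshold for v in values) / len(values)


# Stand-in for real-world data: skewed and strictly positive (an assumption).
rng = random.Random(42)
real = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]

# Synthetic data from the fitted Gaussian.
synthetic = fit_and_sample(real, len(real))

# Compare how often each dataset exceeds the real data's 99th percentile.
threshold = statistics.quantiles(real, n=100)[-1]
real_tail = tail_fraction(real, threshold)
synthetic_tail = tail_fraction(synthetic, threshold)

print(f"mean: real={statistics.fmean(real):.2f} synthetic={statistics.fmean(synthetic):.2f}")
print(f"share above real 99th percentile: real={real_tail:.4f} synthetic={synthetic_tail:.4f}")
print(f"minimum: real={min(real):.2f} synthetic={min(synthetic):.2f}")
```

In this sketch the two means agree closely, yet the synthetic sample rarely reaches the real data's upper tail and contains negative values that the real data never produces. A more expressive generator would narrow these gaps, but the same check (comparing tails and value ranges, not only averages) applies.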