• bamboo@lemmy.blahaj.zone
    English · 6 upvotes · 9 months ago

    Rather than eliminating some of the training data, you could add more training data to create an even balance.
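
As a rough illustration of balancing by adding rather than removing, here is a minimal sketch. It assumes the data is a list of labelled examples, and it uses resampling with replacement as a stand-in for actually collecting or generating new examples. The function name `oversample_to_balance` and the `label_of` callback are hypothetical, not from any library.

```python
import random


def oversample_to_balance(examples, label_of, seed=0):
    """Return a new list in which every label appears equally often.

    Nothing is dropped: smaller groups are topped up by drawing extra
    copies at random (with replacement) until they match the largest group.
    In practice the extra items would ideally be new data, not duplicates.
    """
    rng = random.Random(seed)

    # Group the examples by their label.
    groups = {}
    for ex in examples:
        groups.setdefault(label_of(ex), []).append(ex)
    if not groups:
        return []

    # Every group is grown to the size of the largest one.
    target = max(len(group) for group in groups.values())

    balanced = []
    for group in groups.values():
        balanced.extend(group)  # keep all original examples
        balanced.extend(rng.choices(group, k=target - len(group)))  # top up
    return balanced
```

For example, with six examples of one label and two of another, the result has six of each and still contains every original example. Plain duplication adds no new information, which is one reason the replies turn to synthetic data.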

    • kromem@lemmy.world
      English · 2 upvotes · 9 months ago

      Indeed - there’s a very good argument for using synthetic data to introduce diversity as long as you can avoid model collapse.