• planish@sh.itjust.works
    1 year ago

    The problem is that these models have no idea about the internal structure of the tokens they use, except what's present in the data set. The model sees “Kenya” as something like 8473 299 = “Ken” + “ya”, and how is it supposed to know that token 8473, often used for the name of Barbie’s boyfriend, starts with K?
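
To make that concrete, here is a rough sketch with a toy tokenizer. The vocabulary and every ID in it are made up for illustration (8473 and 299 are just the guesses from above), and it uses greedy longest-match, whereas real tokenizers typically use learned BPE merges. The point is only that what reaches the model is a list of integers with no letters in it.

```python
# Toy vocabulary. All IDs are hypothetical, not taken from any real tokenizer.
VOCAB = {"Ken": 8473, "ya": 299, "K": 42, "e": 68, "n": 77, "y": 88, "a": 64}
ID_TO_PIECE = {token_id: piece for piece, token_id in VOCAB.items()}


def encode(text):
    """Greedy longest-match tokenization against VOCAB (a simplification of BPE)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


token_ids = encode("Kenya")
print(token_ids)  # this list of integers is all the model receives

# The spelling only comes back through the lookup table, which lives in the
# tokenizer, outside the model.
print([ID_TO_PIECE[t] for t in token_ids])
```

Nothing about the number 8473 says "starts with K". The model can only pick that up if the training data happens to spell it out somewhere.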

    Also they love to make up Fun Facts.