an AI resume screener had been trained on CVs of employees already at the firm, giving people extra marks if they listed “baseball” or “basketball” – hobbies that were linked to more successful staff, often men. Those who mentioned “softball” – typically women – were downgraded.

Marginalised groups often “fall through the cracks, because they have different hobbies, they went to different schools”

  • spujb@lemmy.cafe
    9 months ago

    fucking bonkers that institutionalized racism can exist to such a degree that it shows up IN OUR COMPUTERS.

    we’re so racist we made the computers discriminatory too.

    • TheMurphy@lemmy.world
      9 months ago

      I don’t think you know how LLMs are trained, then. It can become racist by mistake.

      As an example, say there are 100,000 white people and 50,000 black people in a society. The statistics show that twice as many white people have been hired as black people. What does this tell you?

      Obvious! There are also twice as many white people to begin with, so black and white people are hired at the same rate! But what does the AI see?

      It sees twice as many white people being hired. And then it can lean towards doing the same.

      You see how this was/is in no way racist, but it ends up that way as a consequence of something completely different.

      TL;DR: People are still racist, but that’s not always why the AI is.
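
A rough sketch of the arithmetic in the example above, in case it helps. The population sizes are the ones from the comment; the hire counts are made up purely for illustration, chosen so that both groups are hired at the same per-capita rate. It shows the gap between the per-group hire rate and each group's raw share of hires, which is the number a naive model would pick up on.

```python
# Population sizes come from the example; hire counts are hypothetical.
population = {"white": 100_000, "black": 50_000}
hires = {"white": 10_000, "black": 5_000}


def hire_rates(hires, population):
    """Per-capita hire rate for each group."""
    return {group: hires[group] / population[group] for group in population}


def hire_shares(hires):
    """Each group's share of all hires, ignoring group size."""
    total = sum(hires.values())
    return {group: count / total for group, count in hires.items()}


rates = hire_rates(hires, population)
shares = hire_shares(hires)

print("rate per group: ", rates)
print("share of hires: ", shares)
```

The rates come out equal, while the shares do not. A model that only ever sees the hired people, and never the group sizes, has nothing to tell those two situations apart.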

      • BluesF@lemmy.world
        9 months ago

        The bias is really introduced at the design stage. Designers should be aware of demographic differences and incorporate that into the model to produce something more balanced. It’s far from impossible to design models that do not become biased in this way, even from biased data, although that is not to say it’s easy.

      • Nollij@sopuli.xyz
        9 months ago

        I suppose it depends on how you define “by mistake”. Your example is an odd bit of narrowing the dataset, which I would certainly describe as an unintended error in the design. But the original is more pertinent: it wasn’t intended to be sexist (etc.), but since it was designed to mimic us, it also copied our bad decisions.

        • Vanth@reddthat.com
          9 months ago

          Oh man, so many good examples of this.

          See the photo recognition software and smartwatch sensors that don’t work as well for black people because no one thought to make sure black people were adequately represented in the test data.

          Or the decades of medical research based only on male mice, because female mice have different hormone levels that introduce more variability (just like female humans), and that’s, like, just too much work to deal with, so it’s easier to assume the female body responds to medications the same way the male body does.

      • spujb@lemmy.cafe
        9 months ago

        you are right, i don’t know how LLMs are trained, but ironically, this is a perfect example of a minority being privileged by a system, and racism is still very much involved.

        an important assumption you have to consider: in your example, why did the AI know what race people are in the first place? it seems a small consideration but it’s so wildly significant.

        the modern understanding of race was not present throughout all of history, and only arose in the 17th century. without getting into the weeds, the fact that your fictional AI can distinguish between whiteness and non-whiteness already means it was designed by someone who understands those structures, and let them slip into the AI itself.

        a perfectly well-meaning and anti-racist designer would prevent the AI from even recognizing race at all costs, both directly by sanitizing training data to remove race from the inputs, and indirectly by noting correlations with other data (such as sports, in this article) and controlling for that.
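
A minimal sketch of the "noting correlations with other data" step, under heavy assumptions. The records are entirely made up, the group labels `A` and `B` are placeholders, and the 0.7 cutoff is an arbitrary choice, not an established threshold. It only shows the idea: even with the protected attribute removed from the inputs, a feature like a listed hobby can still stand in for it if one group dominates that feature.

```python
from collections import Counter, defaultdict

# Hypothetical toy CV records: (hobby listed, protected group).
records = [
    ("baseball", "A"), ("baseball", "A"), ("baseball", "A"), ("baseball", "B"),
    ("softball", "B"), ("softball", "B"), ("softball", "B"), ("softball", "A"),
    ("chess", "A"), ("chess", "B"),
]


def group_share_by_feature(records):
    """For each feature value, the fraction of records belonging to each group."""
    counts = defaultdict(Counter)
    for feature, group in records:
        counts[feature][group] += 1
    return {
        feature: {g: n / sum(c.values()) for g, n in c.items()}
        for feature, c in counts.items()
    }


def likely_proxies(records, threshold=0.7):
    """Feature values dominated by a single group (share >= threshold)."""
    shares = group_share_by_feature(records)
    return sorted(f for f, s in shares.items() if max(s.values()) >= threshold)


print(likely_proxies(records))
```

On this toy data, `baseball` and `softball` get flagged and `chess` does not. Flagging is only the first step; what to do with a flagged feature (drop it, reweight, audit outcomes) is a separate design decision.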

      • msage@programming.dev
        9 months ago

        Oh, there is so much racist data that the AI is being trained on.

        Your example is a simple one. But there is also discrimination based on names, for instance: Johns get hired more than Quachins, and that is done by people, before it ever gets to the AI.