• Windex007@lemmy.world
    1 day ago

    The methodology sounds bizarrely complex to me for the purposes of establishing comparative information transfer rate.

    Wouldn’t just timing how long it takes to communicate a controlled set of information answer that?

    I’m confused by the concept of establishing an average “bitrate per syllable” and multiplying that through. Is this trying to address cases where language constructs DEMAND additional information be encoded in speech? Can one not construct a set of information intended to be communicated that could account for those quirks? Find some “lowest common denominator” sentences?

    I feel like I’m missing something and I’m very curious about what my faulty assumption is.

    • Lvxferre [he/him]@mander.xyz
      1 day ago

      Can one not construct a set of information intended to be communicated that could account for those quirks? Find some “lowest common denominator” sentences?

      I think this would require deeper knowledge of all 17 languages in question, and be a potential source of errors - for example, if you include some info in the set that is easier/harder to convey succinctly in one language than in the other languages.

      In the meantime, it’s easy to get good averages for bits/syllable and syllables/second, even if you don’t know the languages in question.
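
As a rough sketch of why the bits/syllable average needs no knowledge of the language: given any syllabified text, one can count syllable frequencies and take the Shannon entropy of that distribution. The sample below is hypothetical, and this plain frequency-based estimate is an assumption for illustration; the paper's own estimate may be computed differently.

```python
import math
from collections import Counter


def entropy_bits(counts):
    """Shannon entropy, in bits, of the distribution given by raw counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical syllabified sample; a real estimate needs a large corpus.
syllables = ["ba", "na", "na", "ba", "to", "na", "to", "ki"]

# Average information carried by one syllable, ignoring context.
bits_per_syllable = entropy_bits(Counter(syllables))
print(f"{bits_per_syllable:.3f} bits/syllable")
```

Nothing here depends on what the syllables mean, only on how often each one occurs.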

      • Windex007@lemmy.world
        14 hours ago

        I agree there would be challenges around information selection. I expect Runasimi can speak more “quickly” or “efficiently” than Inuktitut about labour-based taxation in the form of terraced plateaus growing cocoa, but would find itself at the opposite disadvantage when describing the ice floe route, and the associated ice quality, that a polar bear took while hunting a seal.

        Also, just because a syllable “encodes more bits on average” does it imply faster transmission rate? Just because French encodes gender information into its language and syllables, isn’t knowing the gender of a shovel at best “check bits”? Used for detecting transmission errors but not intrinsically critical data?

        I’m not a linguist. I’m barely a scientist. I’m fascinated by the assertion that it’s easy to establish “bits per second” on syllables having somehow abstracted away social context. I’m not saying you’re wrong or they’re wrong, just that this rubs my naive intuitions exactly the wrong way… Which speaks more to the quality of my intuitions (apparently quite bad) rather than the real science by people actually in the field.

        • Lvxferre [he/him]@mander.xyz
          11 hours ago

          Also, just because a syllable “encodes more bits on average” does it imply faster transmission rate?

          If by “faster” you’re measuring:

          • the transmission per syllable - then yes
          • the transmission per second - then no

          This is easier to see in the original paper than in the OP. Check page 3; the second column is the rate of transmission per second, it’s roughly 35~45 bits/s for all of them.
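
The two measures are linked by a simple product: bits per second = bits per syllable × syllables per second. A minimal sketch with made-up numbers (the two "languages" and their values are hypothetical, chosen only to show the trade-off, and are not taken from the paper):

```python
def info_rate(bits_per_syllable, syllables_per_second):
    """Information rate in bits/second as density times speech rate."""
    return bits_per_syllable * syllables_per_second


# Hypothetical values, for illustration only.
dense_slow = info_rate(bits_per_syllable=8.0, syllables_per_second=5.0)
light_fast = info_rate(bits_per_syllable=5.0, syllables_per_second=8.0)

print(dense_slow, light_fast)
```

A language can win per syllable and still tie per second if its speakers produce fewer syllables per second.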

          Just because French encodes gender information into its language and syllables, isn’t knowing the gender of a shovel at best “check bits”? Used for detecting transmission errors but not intrinsically critical data?

          At least in theory, redundancy required by [gender, number, case, etc.] agreement shouldn’t count, as it isn’t adding new information - it’s only repeating info already provided. In practice it’s hard to model this, so the numbers for gendered languages might be a bit overestimated.
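
The "repeating info already provided" point can be put in entropy terms: if the article is fully determined by the noun, then its conditional entropy given the noun is zero, even though the article looks like it carries a bit when counted in isolation. The sketch below is a toy model with four French nouns, and the helper names are made up for illustration; it says nothing about how hard this is to model on real speech.

```python
import math
from collections import Counter


def entropy_bits(counts):
    """Shannon entropy, in bits, of the distribution given by raw counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy_bits(pairs):
    """H(Y | X) in bits for a list of (x, y) observations."""
    joint = Counter(pairs)
    x_counts = Counter(x for x, _ in pairs)
    total = len(pairs)
    h = 0.0
    for (x, _y), c in joint.items():
        # p(x, y) * log2 p(y | x), accumulated with a minus sign
        h -= (c / total) * math.log2(c / x_counts[x])
    return h


# Toy data: (noun, article). Each noun always takes the same article.
pairs = [("pelle", "la"), ("marteau", "le"), ("table", "la"), ("livre", "le")]

article_alone = entropy_bits(Counter(article for _, article in pairs))
article_given_noun = conditional_entropy_bits(pairs)

print(article_alone, article_given_noun)
```

Counted alone, the article seems to carry one bit; given the noun, it carries none, which is why agreement markers can inflate a per-syllable estimate that ignores this dependency.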

          Note however gender has a second role, besides agreement: derivation. Derivation should actually increase bits/second, since it allows you to convey some things succinctly.

          I’m fascinated by the assertion that it’s easy to establish “bits per second” on syllables having somehow abstracted away social context.

          The social context (and context as a whole) plays a huge role in that, as paralinguistic information. However, the scope there is only the linguistic info encoded by the language itself.

          • Windex007@lemmy.world
            9 hours ago

            Note however gender has a second role, besides agreement: derivation

            Interesting… I hadn’t considered that this might enable linguistic “shorthands”, is that the implication?

            Sounds to me on the whole like you’re saying that the bitrate per syllable is solid and doing the heavy lifting here?

            It’s super interesting; and the implications are actually huge.

            I’d be interested in follow-up studies to examine emergent linguistic patterns. Can we weigh syllabic encoding by common usage by age? If we eliminate “thouest” from the dictionary but include “skibidi”, how does that skew patterns for informational density?

            Science is so fucking cool and I’m stoked that people nerd out on shit that I’m an idiot about so I can learn about the nature of the world.

  • schnurrito@discuss.tchncs.de
    1 day ago

    The article is from 2019. I was wondering because my impression was that this was pretty old news and common knowledge by now.