• 0 Posts
  • 175 Comments
Joined 11 months ago
Cake day: July 28th, 2023


  • London is full of excellent amazing things but they’re spread out over an absurdly large area so it’s such a pain doing anything. And everyone who lives there is so numb to it! They’ll happily indulge every day in 3-4 hours of public transport as if this is a rational way to live.

    I’m very happy that they have a reasonably decent transit system, but fuck me I wanted those 4 hours in my life actually.








  • The Turing test is flawed, because while it is supposed to test for intelligence it really just tests for a convincing fake. Depending on how you set it up I wouldn’t be surprised if a modern LLM could pass it, at least some of the time. That doesn’t mean they are intelligent, they aren’t, but I don’t think the Turing test is good justification.

    For me the only justification you need is that they predict one word (or even letter!) at a time. ChatGPT doesn’t plan a whole sentence out in advance, it works token by token… The input to each prediction is just everything so far, up to the last word. When it starts writing “As…” it has no concept of the fact that it’s going to write “…an AI language model” until it gets through those words.

    Frankly, given that fact it’s amazing that LLMs can be as powerful as they are. They don’t check anything, think about their answer, or even consider how to phrase a sentence. Everything they do comes from predicting the next token… An incredible piece of technology, despite its obvious flaws.
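
    To make the token-by-token loop concrete, here is a toy sketch. It is not how ChatGPT is implemented: the "model" is a made-up bigram lookup table that only looks at the last token, whereas a real LLM conditions on the whole context. The names `train_bigram`, `make_predictor` and `generate` are invented for this example. The point is only the shape of the loop: predict one token, append it, repeat.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the training text."""
    table = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] += 1
    return table


def make_predictor(table):
    """Build a predict_next(context) function backed by the bigram table."""
    def predict_next(context):
        # The full context is passed in, but this toy model only uses the
        # last token. A real LLM would use all of it.
        followers = table.get(context[-1])
        if not followers:
            return None  # nothing ever followed this token in training
        return followers.most_common(1)[0][0]
    return predict_next


def generate(predict_next, prompt, max_new_tokens):
    """Greedy autoregressive generation: one token per step, no lookahead."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        token = predict_next(out)
        if token is None:
            break
        out.append(token)
    return out


corpus = "as an ai language model i cannot do that".split()
predict_next = make_predictor(train_bigram(corpus))
print(" ".join(generate(predict_next, ["as"], max_new_tokens=4)))
```

    At the moment it emits "as", nothing in the loop knows that "model" is four steps away; each word only appears once the previous ones have been appended.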


  • Do you think the paper drew sensible conclusions, or do you just not like my arguments?

    A correlation coefficient of .5 is in the ballpark of or bigger than the correlation between human height and weight. I wouldn’t be surprised if the bottleneck isn’t in the reliability of the measurement.

    This is fair enough, my background is not in social research so to me 0.5 is a moderate correlation. Not sure what you mean by the ‘bottleneck’ here, are you suggesting that the correlations could be higher with a different survey?

    Unmodeled interactions here also would only be able to suppress the explained variance - adding them in could only increase the R-squared!

    Given that the explanatory variables are in some cases more strongly correlated with each other than the response, do you think the model without interactions is likely to be an appropriate way to analyse the relationship between the response and the explanatory variables? It doesn’t at all make sense to me to do one single regression model and say “The F test says this is a good model, so the explanatory variables explain the response”, especially with a relatively low R^2, and given the fact that there is evidence of multicollinearity presented alongside!
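
    One rough way to put a number on the multicollinearity worry is a variance inflation factor (VIF). A minimal sketch for the simplest case of exactly two explanatory variables, where the VIF reduces to 1 / (1 - r^2). The function name and the r = 0.6 input are made up for illustration and are not values from the paper:

```python
def vif_two_predictors(r):
    """Variance inflation factor for a model with exactly two predictors.

    r is the Pearson correlation between the two predictors. With more
    predictors, r**2 would be replaced by the R-squared obtained from
    regressing one predictor on all the others.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


# Hypothetical correlation between two predictors (not taken from the paper).
print(round(vif_two_predictors(0.6), 4))
```

    A VIF of 1 means no inflation; the further above 1 it gets, the less stable the individual coefficient estimates become.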

    The paper presents the fact that they have done a regression model with a few good significances without any real analysis of whether that model is good. We don’t see if the relationships are linear, we don’t see if the model assumptions are met. Just doing a regression is not enough, in my opinion.

    In case your 101 course hasn’t covered that yet:

    There’s no need to be rude. It’s perfectly acceptable to disagree with me, but you could do it politely.

    F-tests are also commonly used when performing an analysis of variance.

    Yes, I’m well aware, although I’m not sure what your point is. They haven’t done any analysis of variance.

    As is it’s impossible to say if the model they found is actually very good.

    You say that after quoting explained variance, which is much more useful (could use confidence intervals… but significance substitutes here a little) in this context for judging how good a model is in absolute terms than some model comparison would be (which could give relative goodness).

    My point is that they haven’t made any effort to find a model that best fits the data, they have just taken all the available variables, smacked them into Python or R or whatever, and written down the statistics it spits out. There’s no consideration in the paper given to interpreting the statistics, or to confirming their validity.

    From the study:

    Although the regression weight for age was not significant, the direction was negative, suggesting greater endorsement for the car items for the younger sample.

    Not only was the p-value for age clearly not significant, the confidence interval for the coefficient was [–.21, .17]… This includes 0 ffs! There’s no evidence here that there is greater endorsement of the car items in younger respondents. Why was age even included in the model in the first place, given that the correlation was near 0?
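
    The check being applied to that interval is trivial to write down. A small sketch, with a helper name invented for this example, using the bounds quoted from the study:

```python
def interval_excludes_zero(lower, upper):
    """True if a confidence interval lies entirely on one side of zero."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return lower > 0 or upper < 0


# Confidence interval for the age coefficient, as quoted from the study.
age_ci = (-0.21, 0.17)
print(interval_excludes_zero(*age_ci))
```

    If the interval straddles zero, the data are compatible with a negative effect, a positive effect, or none at all, so the sign of the point estimate can't be read as a direction.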

    Like I said - there is some evidence here of an interaction, I’ll concede that in context the correlation isn’t bad for 2 of the dark tetrad items, Wild and Crafty, but the analysis they have used to present this information is not well thought out or presented. Personally I don’t think that a linear regression model is even the right way to analyse the data they have, I especially don’t think this regression model is a good way to analyse the data.


  • You’re right to be sceptical. The paper is poorly written, and overstates many of the results they found. The correlations identified between the car score and the dark tetrad scores aren’t really very high, the highest is 0.51! They produced a regression model and deduced that, because the F-test had a low p value, the dark tetrad scores predicted the car score. The F-test, for clarity, determines if a model predicts the response variable better than a model with no explanatory variables.

    Also worth noting that there were stronger correlations between the explanatory variables than for any of the explanatory variables with the response. They should have included interactions in their regression model to incorporate this, or even better tried a set of models and compared them with ANOVA or similar. As is it’s impossible to say if the model they found is actually very good. It only explains 29% of the variance which… Well, it’s a statistic which is better for comparing models, but it suggests quite clearly that most of the variance in the car score is not explained by the dark tetrad scores.
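
    To show why a low p value on the overall F-test says so little on its own, here is a sketch of how the F statistic follows from R^2. The 0.29 is the explained variance mentioned above; the sample size and number of predictors are placeholders picked for illustration, not the paper's actual values, and the function name is made up:

```python
def overall_f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic of a linear regression, computed from R-squared.

    Compares the fitted model against an intercept-only model:
    F = (R^2 / k) / ((1 - R^2) / (n - k - 1)).
    """
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    if df_model <= 0 or df_resid <= 0:
        raise ValueError("need at least one predictor and n_obs > n_predictors + 1")
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return (r_squared / df_model) / ((1.0 - r_squared) / df_resid)


# R-squared of 0.29 as discussed above; n_obs and n_predictors are
# hypothetical placeholders, not figures from the paper.
f_small = overall_f_statistic(0.29, n_obs=200, n_predictors=5)
f_large = overall_f_statistic(0.29, n_obs=2000, n_predictors=5)
print(round(f_small, 2), round(f_large, 2))
```

    For a fixed R^2 the statistic grows with the sample size, so a "significant" F-test can sit alongside a model that leaves most of the variance unexplained.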

    There’s a smattering of evidence in here that there’s some statistical link between the scores, but it’s not been well explored or presented, and there are issues with the statistical approach. Based on some comments in the discussion section I’d agree with your suggestion that the author is simply trying to confirm their hypothesis.



  • BluesF@lemmy.world to 196@lemmy.blahaj.zone • Rule
    10 upvotes · 29 days ago

    Everything? That’s pretty impressive. I wonder how many illegal things you could realistically pack into your day… There are many inefficient legal things involved in my day, it would be tricky to make them all illegal. If I steal a lot of stuff in advance maybe then using it becomes sort of illegal? Tricky.



  • BluesF@lemmy.world to 196@lemmy.blahaj.zone • “rule” eh?
    4 upvotes · edited · 1 month ago

    It’s an r/bonehurtingjuice meme (a little like an anti-meme, but bonehurtingjuice takes memes and recontextualises them).

    The original comic it’s referencing shows an air hostess asking someone to turn off a tablet for takeoff. An older man, holding a book, chimes in with ‘I guess I don’t have to turn mine “off” eh? Ha! Heh heh.’

    The laugh in particular is so off-kilter and strange that it became a meme in itself, a very persistent one used in tons of other bonehurtingjuice memes.



  • BluesF@lemmy.world to 196@lemmy.blahaj.zone • 📄 rule
    43 upvotes · 1 month ago

    The annoying “letter” paper size is, for some unknown reason, what Windows always sets as the paper size unless I change it to A4 manually. Naturally, if I forget, the printer won’t print. US paper sizing - annoying me on the other side of the Atlantic.