Claus Wilke

@clauswilke

654 Following
6038 Followers

Claus Wilke
@clauswilke
We found that there are many pitfalls with reproducibility and in particular ESM C out of the box _will not_ give you reproducible embeddings. A major problem if you don't pay attention to it. Fortunately there are easy fixes. 2/2
0 reply
0 recast
0 reaction

Claus Wilke
@clauswilke
We updated our survey of transfer learning with protein language models. Now it includes ESM C and AMPLIFY in addition to ESM-2 and ESMv. Quick conclusion: ESM C 600M is the model to use. But note reproducibility issues in next cast. 1/2 https://www.biorxiv.org/content/10.1101/2024.11.22.624936v2
1 reply
2 recasts
5 reactions

Brad Connell
@fauxjebus
Get ready for FEBRUAIRY! One prompt per day to inspire you to create AI art. Feel free to work ahead, or behind. Just make sure to post your work in here and include #Februairy2025 if you post anywhere else that supports hashtags (X, Instagram, Rodeo).
8 replies
18 recasts
40 reactions

Claus Wilke
@clauswilke
I made an interactive worksheet that provides a basic introduction to the R language. Meant for people who may be familiar with Python or other languages and want to understand some of the core concepts of how R works. https://wilkelab.org/SDS366/worksheets/intro-to-R.html This is part of my class on data visualization with R.
2 replies
1 recast
4 reactions

Claus Wilke
@clauswilke
Any chemical modification, and roughly 100 amino acids. Our context window is 768 tokens, and most amino acids require about 7 tokens to be represented, but details depend on the size of the amino acid. (The larger amino acids need more tokens.)
1 reply
0 recast
1 reaction
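
The "roughly 100 amino acids" figure in the post above follows from simple division. Below is a minimal sketch of that arithmetic, assuming a flat cost of about 7 tokens per amino acid as stated in the post; the helper name `max_residues` is hypothetical and not part of the model's code, and real peptides will vary because larger amino acids need more tokens.

```python
# Back-of-the-envelope estimate of how many amino acids fit in the context window.
# Both numbers come from the post; the flat per-residue cost is a simplification.
CONTEXT_WINDOW_TOKENS = 768
TOKENS_PER_AMINO_ACID = 7  # "about 7"; larger amino acids need more


def max_residues(context_tokens: int, tokens_per_residue: int) -> int:
    """Hypothetical helper: whole residues that fit at a flat token cost."""
    return context_tokens // tokens_per_residue


estimate = max_residues(CONTEXT_WINDOW_TOKENS, TOKENS_PER_AMINO_ACID)
pessimistic = max_residues(CONTEXT_WINDOW_TOKENS, TOKENS_PER_AMINO_ACID + 1)
print(estimate, pessimistic)
```

At 7 tokens per residue the window holds a little over 100 residues, and at 8 tokens it drops just below 100, which is consistent with the rough figure quoted.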

Claus Wilke
@clauswilke
Despite the title, which suggests this is particularly for predicting membrane diffusion, the model is general and can be fine-tuned to any downstream application you may be interested in. If you don't have access to the journal pub, the bioRxiv version is here: https://www.biorxiv.org/content/10.1101/2024.08.09.607221v2.abstract
0 reply
0 recast
1 reaction

Claus Wilke
@clauswilke
Now published: We trained a chemical language model that works with both small molecules and peptides. Ideal for making predictions on peptides with non-standard amino acids or other chemical modifications. https://pubs.acs.org/doi/full/10.1021/acs.jcim.4c01441
3 replies
2 recasts
12 reactions

Claus Wilke
@clauswilke
Yeah, Mueller is surrounded by a stretched-out park and little lakes. I don't think many people are aware. It tends to be relatively empty, in particular on the east side.
1 reply
0 recast
1 reaction

Claus Wilke
@clauswilke
It's there. https://maps.app.goo.gl/LkTucNmNdtCk6YqK9
1 reply
0 recast
1 reaction

Claus Wilke
@clauswilke
Sunset in Austin, TX.
1 reply
1 recast
16 reactions

Claus Wilke
@clauswilke
I'm referring to classes I teach at UT Austin. It's either for data science undergraduates or for students taking the online Master's in data science.
1 reply
0 recast
1 reaction

Claus Wilke
@clauswilke
Btw, I'm reworking my class on this topic, now with interactive R worksheets that allow you to learn R dataviz in the browser. As of today I have about 25% of the material copied over to the new format. https://wilkelab.org/SDS366/
0 reply
1 recast
1 reaction

Claus Wilke
@clauswilke
And let's not even get to handling of missing values, which is a mess in Python because it doesn't have a native missing value datatype.
1 reply
0 recast
1 reaction
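
To illustrate the missing-value point in the post above, here is a small sketch using only the Python standard library. It is an illustration under stated assumptions, not a claim about any particular data library: the sample lists and the helper `mean_skipping_missing` are hypothetical. It shows the two usual stand-ins for a missing value, `None` and `float("nan")`, and how each behaves differently in plain arithmetic.

```python
import math
import statistics

# Two common stand-ins for "missing" in plain Python (hypothetical sample data).
values_with_none = [1.0, None, 3.0]
values_with_nan = [1.0, float("nan"), 3.0]

# None is not numeric, so arithmetic on it raises TypeError.
try:
    sum(values_with_none)
    none_sum_failed = False
except TypeError:
    none_sum_failed = True

# NaN is a float, so it propagates silently through arithmetic instead.
nan_total = sum(values_with_nan)


def mean_skipping_missing(values):
    """Hypothetical helper: mean over entries that are neither None nor NaN."""
    present = [v for v in values if v is not None and not math.isnan(v)]
    return statistics.fmean(present)


print(none_sum_failed, math.isnan(nan_total))
print(mean_skipping_missing(values_with_none), mean_skipping_missing(values_with_nan))
```

Because neither stand-in is a dedicated missing-value type, every computation has to decide explicitly how to filter or propagate them, as the helper does here.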

Claus Wilke
@clauswilke
This is a never-ending discussion when I teach R in my dataviz classes and all the students ask why not Python. In my experience (as a user of both), if you have tabular data and want to do either interactive exploration of the data, statistical modeling, or publication-quality visualization, R beats Python by a lot. There's a reason the Python people copy all the R/tidyverse concepts, with libraries such as pandas, plotnine, etc. (And yet they're generally much more clunky than the original.) At a minimum, I feel people should know both and then make an informed choice.
2 replies
2 recasts
2 reactions

Claus Wilke
@clauswilke
Piter, apologies, I did not mean to offend. I thought this was a fun way for the AI community to engage with genuary. If you have a strict no-AI policy I am happy to abide by it. Just let me (and everybody else here) know. Going by the number of likes my post got, there seems to be substantial community interest, but I'm happy to take down my post and tell people not to post AI-generated genuary art here in this channel.
1 reply
0 recast
2 reactions

Claus Wilke
@clauswilke
Sure. It's your art, you can do with it whatever you want. It's just a fun activity with multiple people doing their own interpretation of the same prompt each day.
0 reply
0 recast
2 reactions

Claus Wilke
@clauswilke
You may remember that last January I did all prompts of Gen-AI-uary 2024. Here is a video compilation of all the artworks I made then.
1 reply
0 recast
2 reactions

Claus Wilke
@clauswilke
Get ready for Gen-AI-uary 2025, the AI version of this popular generative artists activity for the first month of each year. The prompts are available. Do one per day. Do them with AI. Post them in /ai-art. https://warpcast.com/piterpasma/0x1e9bdce8
4 replies
6 recasts
42 reactions

Piter Pasma
@piterpasma
#GENUARY2025 PROMPTS ARE READY!! GENUARY is an artificially generated month of time where we build code that makes beautiful things. It’s happening during the month of January 2025, and everybody is invited! https://GENUARY.ART #generative #genartclub
3 replies
15 recasts
35 reactions

anoncast
@anoncast
maybe so, but it’s been around too long to die now
2 replies
3 recasts
20 reactions