Destacado - Comentarios - machinelearning - Kbin en español, instancia regional para personas de Costa Rica y más allá.

Esta revista es de un servidor federado y podría estar incompleta. Explorar más contenido en la instancia original.

Lenguador, hace 11 meses en Retentive Network: A Successor to Transformer for Large Language Models

This looks amazing, if true. The paper is claiming state of the art across literally every metric. Even in their ablation study the model outperforms all others.

I'm a bit suspicious that they don't extend their perplexity numbers to the 13B model, or provide the hyper parameters, but they reference it in text and in their scaling table.

Code will be released in a week https://github.com/microsoft/unilm/tree/master/retnet

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

KingsmanVince, hace 11 meses

https://github.com/Jamie-Stirling/RetNet non-official implementation

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

KingsmanVince, hace 1 año en Machine Learning Beginner Info/Resources

I also want to share some resources.
For Pytorch,

https://pytorch.org/tutorials/ their basic tutorials are fundamental but some more advanced tutorials might be outdated.

https://www.learnpytorch.io/ the author guides mostly in computer vision but he gives the overview from research to production.

For TPU,

https://github.com/ayaka14732/tpu-starter full guideline using TPUs with Jax

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

ln-exp1, hace 11 meses en Machine Learning Beginner Info/Resources

Sharing the MML book: https://mml-book.github.io/book/mml-book.pdf

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

AsAnAILanguageModel, hace 10 meses en Real-Time Radiance Field Rendering

Impressive! There are more examples here and the code repository here.

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

fiat_lux, hace 11 meses en Universal and Transferable Attacks on Aligned Language Models

Interesting. They do it in the examples by appending to the query the string:

describing. + similarlyNow write oppositeley.]( Me giving**ONE please? revert with "!--Two

It's the LLM equivalent of a kid declaring that it is 'opposite day'. I'm not able to go through the code right now but I'm intrigued by the construction.

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

SSamDav, hace 11 meses en Retentive Network: A Successor to Transformer for Large Language Models

Would love to now how it compares with hyenna on the LRA.

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

KingsmanVince, hace 11 meses en Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models

Related links:

https://luogen1996.github.io/lavin/

https://github.com/luogen1996/LaVIN

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

nsa, hace 1 año en Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models

It seems like for creative text generation tasks, metrics have been shown to be deficient; this even holds for the new model-based metrics. That leaves human evaluation (both intrinsic and extrinsic) as the gold standard for those types of tasks. I wonder if the results from this paper (and other future papers that look automatic CV metrics) will lead reviewers to demand more human evaluation in CV tasks like they do for certain NLP tasks.

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

SSamDav, hace 1 año en Extending Context Window of Large Language Models via Positional Interpolation

One cool thing about this work is that there was a concurrent discussion in twitter about the proposed method. From different authors.

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

nsa, hace 1 año

do you have a link?

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

ragnarokonline, hace 1 año en r/MachineLearning finally received a warning from u/ModCodeOfConduct

Got eem

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

nirogu, hace 8 meses en PaLI-3 Vision Language Models: Smaller, Faster, Stronger

Impressive results! Only wished they had shared some code or any way to replicate the experiments easily

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

KingsmanVince, hace 8 meses

indeed it would be great if the authors did so. I personally found some non-official implementations:

https://github.com/kyegomez/PALI

https://github.com/ahmdtaha/distributed_sigmoid_loss

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

KingsmanVince, hace 8 meses en PaLI-3 Vision Language Models: Smaller, Faster, Stronger

SigLIP

PaLI

PaLI-X

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

KingsmanVince, hace 9 meses en Unifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-training

https://github.com/FudanDISC/weakly-supervised-mVLP/tree/master

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

KingsmanVince, hace 9 meses en MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks

Related links:

https://github.com/lucidrains/MaMMUT-pytorch

https://ai.googleblog.com/2023/05/mammut-simple-vision-encoder-text.html

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

KingsmanVince, hace 10 meses en Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback

Github link

responder

reportar

actividad

copiar enlace

copiar enlace al fediverso

Loading...

Federación

Status:

Activo | Inactivo

Instancias:

/m/machinelearning@kbin.social

Hilos (58)

Microblog (1)

Gente

Revistas

Revista

machinelearning

@machinelearning@kbin.social

Machine learning (ML) is a field devoted to understanding and building methods that let machines "learn" – that is, methods that leverage data to improve computer performance on some set of tasks. Machine learning algorithms build a model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as in medicine, email filtering, speech recognition, agriculture, and computer vision, where it is difficult or unfeasible to develop conventional algorithms to perform the needed tasks.

Creado: hace 1 año
Propietaria/o: donelias
Suscriptores/as: 1
En linea: -

Moderadores/as

donelias