Category: Notes
-
PictureText: Interactive visuals of text | by Mihail Dungarov | Oct, 2020 | Medium
https://medium.com/@mihail.dungarov/picturetext-interactive-visuals-of-text-591e6375c1d6
-
Transformer-based Encoder-Decoder Models
https://huggingface.co/blog/encoder-decoder
-
GitHub – twintproject/twint: An advanced Twitter scraping & OSINT tool written in Python that doesn’t use Twitter’s API, allowing you to scrape a user’s followers, following, Tweets and more while evading most API limitations.
https://github.com/twintproject/twint
-
Creating Test Kubernetes Clusters With Kind – DZone Cloud
https://dzone.com/articles/creating-test-kubernetes-clusters-with-kind?utm_content=buffer269a0&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer
-
Machine Learning Algorithms For Beginners with Code Examples in Python
https://medium.com/towards-artificial-intelligence/machine-learning-algorithms-for-beginners-with-python-code-examples-ml-19c6afd60daa
-
10 October, 2020 07:52
https://github.com/vector-ai/vectorai
-
🎻Fine-tune Transformers in PyTorch using 🤗 Transformers | by George Mihaila | Oct, 2020 | Medium
https://medium.com/@gmihaila/fine-tune-transformers-in-pytorch-using-transformers-57b40450635
-
AI Training Method Exceeds GPT-3 Performance with 99.9% Fewer Parameters
"Using PET, the researchers fine-tuned an ALBERT Transformer model and achieved an average score of 76.8 on the SuperGLUE benchmark, compared to GPT-3’s 71.8."
A team of scientists at LMU Munich have developed Pattern-Exploiting Training (PET), a deep-learning training technique for natural language processing…
https://www.infoq.com/news/2020/10/training-exceeds-gpt3/
-
Yann LeCun’s Deep Learning Course at CDS is Now Fully Online & Accessible to All
Course website: https://cds.nyu.edu/deep-learning/
Reddit forum: https://www.reddit.com/r/NYU_DeepLearning/
GitHub website: https://atcold.github.io/pytorch-Deep-Learning/
Blog post: https://medium.com/@NYUDataScience/yann-lecuns-deep-learning-course-at-cds-is-now-fully-online-accessible-to-all-787ddc8bf0af
-
1 line to BERT Word Embeddings with NLU in Python
https://medium.com/spark-nlp/1-line-to-bert-word-embeddings-with-nlu-f50d2b08cddc
By Christian Kasim Loan | spark-nlp | Sep, 2020 | Medium
Including Part of Speech, Named Entity Recognition, Emotion Classification in the same line! With Bonus t-SNE plots! With the freshly released NLU library which gives you 350+ NLP models and 100+…
-
microsoft/presidio: Context aware, pluggable and customizable data protection and anonymization service for text and images
https://github.com/microsoft/presidio
-
You need just 5 minutes to create Synapse workspace and run your first Data Lake query!!! – Microsoft Tech Community
https://techcommunity.microsoft.com/t5/azure-synapse-analytics/you-need-just-5-minutes-to-create-synapse-workspace-and-run-your/ba-p/1750253
-
kelvins/awesome-mlops: A curated list of awesome MLOps tools
https://github.com/kelvins/awesome-mlops
-
MaartenGr/BERTopic: Leveraging BERT and a class-based TF-IDF to create easily interpretable topics.
https://github.com/MaartenGr/BERTopic
-
Database DevOps
https://www-red--gate-com.cdn.ampproject.org/c/s/www.red-gate.com/blog/database-devops/welcome-to-redgate-deploy-and-cross-platform-database-devops/amp
-
600 NLP Datasets and Glory. State of Big Bad NLP Database | by Quantum Stat | Towards AI — Multidisciplinary Science Journal | Oct, 2020 | Medium
https://medium.com/towards-artificial-intelligence/600-nlp-datasets-and-glory-4b0080bf5ab
-
The Big Bad NLP Database
https://datasets.quantumstat.com/
-
ML Inference on Edge devices with ONNX Runtime using Azure DevOps+MLOps – Microsoft Tech Community
https://techcommunity.microsoft.com/t5/ai-customer-engineering-team/ml-inference-on-edge-devices-with-onnx-runtime-using-azure/ba-p/1737331?WT.mc_id=DOP-MVP-4025064&utm_content=bufferfd11e&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer
-
The Open Source Data Science Masters by datasciencemasters
http://datasciencemasters.org/
-
Don’t Let Your .NET Applications Fail: Resiliency with Polly
https://hackernoon.com/dont-let-your-net-applications-fail-resiliency-with-polly-uz1z3t8t
One aspect of application development that is often overlooked, especially by beginner developers, is application resilience.
-
Missingno
https://github.com/ResidentMario/missingno
ResidentMario/missingno: Missing data visualization module for Python. Messy datasets? Missing values? missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities that allows you to get a quick visual summary of the completeness (or lack thereof) of your dataset. Just pip install missingno to get started…
-
30 September, 2020 07:13
https://www.trainindatablog.com/feature-engineering-for-machine-learning-comprehensive-overview/
-
Spark NLP
https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/1hr_workshop/SparkNLP_openSource_workshop_1hr.ipynb
-
From Research to Production with Deep Semi-Supervised Learning
https://medium.com/@nairvarun18/from-research-to-production-with-deep-semi-supervised-learning-7caaedc39093
By Varun Nair | Sep, 2020 | Medium
FixMatch (Sohn et al., 2020) was a simpler yet more effective version of its predecessor, MixMatch, and we successfully replicated their …
-
Achieving business resilience with cloud application development
https://azure.microsoft.com/fr-ca/blog/achieving-business-resilience-with-cloud-application-development
When it comes to how people want to engage with organizations, advancements in real-time, multichannel communication have raised expectations. Restrictions on physical interactions due to current events are accelerating adoption of remote, cloud-based solutions for customer engagement.
-
Text Classification with NLP: Tf-Idf vs Word2Vec vs BERT
https://towardsdatascience.com/text-classification-with-nlp-tf-idf-vs-word2vec-vs-bert-41ff868d1794
-
Top NLP Libraries to Use 2020
https://towardsdatascience.com/top-nlp-libraries-to-use-2020-4f700cdb841f
Natural Language Processing has been one of the most researched fields in deep learning in 2020, mostly due to its rising popularity, future potential, and support for a wide variety of applications.
-
Snowflake vs Redshift: Why did you choose Snowflake over Amazon Redshift? | Alooma
https://www.alooma.com/answers/why-did-you-choose-snowflake-over-amazon-redshift-for-your-cloud-data-warehouse
-
What is a Lakehouse? – The Databricks Blog
https://databricks.com/fr/blog/2020/01/30/what-is-a-data-lakehouse.html
-
AutoScraper and Flask: Create an API From Any Website in Less Than 5 Minutes – DEV
https://dev.to/alirezamika/autoscraper-and-flask-create-an-api-from-any-website-in-less-than-5-minutes-400j
-
Advancing NLP with Efficient Projection-Based Model Architectures
https://ai.googleblog.com/2020/09/advancing-nlp-with-efficient-projection.html?m=1
https://github.com/tensorflow/models/tree/master/research/sequence_projection
-
2020 NLP Survey Report
https://gradientflow.com/2020nlpsurvey/?utm_source=ben&utm_medium=linkedin&utm_campaign=nlpsurvey
The Natural Language Processing (NLP) Industry Survey was an online survey which ran for 41 days (July 5 to August 14, 2020). A total of 571 respondents from more than 50 countries completed the survey.
-
22 September, 2020 07:17
https://blog.insightdatascience.com/contextual-topic-identification-4291d256a032
-
4 Python AutoML Libraries Every Data Scientist Should Know
https://towardsdatascience.com/4-python-automl-libraries-every-data-scientist-should-know-680ff5d6ad08
-
GitHub – deepset-ai/haystack: Transformers at scale for question answering & neural search. Using NLP via a modular Retriever-Reader-Pipeline. Supporting DPR, Elasticsearch, HuggingFace’s Modelhub…
https://github.com/deepset-ai/haystack/
-
20 September, 2020 20:08
https://medium.com/fintechexplained/advanced-python-metaprogramming-980da1be0c7d
-
Deep Learning on Graphs
http://cse.msu.edu/~mayao4/dlg_book/
-
Unsupervised Meta-Learning Is All You Need | by James Le | Cracking The Data Science Interview | Sep, 2020 | Medium
https://medium.com/cracking-the-data-science-interview/unsupervised-meta-learning-is-all-you-need-71b6dfa29ccd
-
18 September, 2020 07:25
https://towardsdatascience.com/pycaret-2-1-is-here-whats-new-4aae6a7f636a
-
Streamlit. The fastest way to build data apps
Streamlit’s open-source app framework is the easiest way for data scientists and machine learning engineers to create beautiful, performant apps in only a few hours! All in pure Python. All for free.
https://www.streamlit.io/
-
AutoPlotter – autoplotter is a python package for GUI based exploratory data analysis. It is built on the top of dash.
https://github.com/ersaurabhverma/autoplotter
Installation: use the package manager pip to install autoplotter.
-
Pegasus_Paraphrasing – Colaboratory
https://colab.research.google.com/drive/1RWvGuHKnPur7fCL0DObMeZXQVHem6aEV?usp=sharing
-
GitHub – mingrammer/diagrams: Diagram as Code for prototyping cloud system architectures
https://github.com/mingrammer/diagrams
-
Google TAPAS is a BERT-Based Model to Query Tabular Data Using Natural Language | by Jesus Rodriguez | DataSeries | Sep, 2020 | Medium
https://medium.com/dataseries/google-tapas-is-a-bert-based-model-to-query-tabular-data-using-natural-language-2435a386b43f
-
GitHub – huggingface/datasets: 🤗 Fast, efficient, open-access datasets and evaluation metrics for Natural Language Processing in PyTorch, TensorFlow, NumPy and Pandas
https://github.com/huggingface/datasets
-
GitHub – huggingface/pytorch_block_sparse: Fast Block Sparse Matrices for Pytorch
https://github.com/huggingface/pytorch_block_sparse
-
DeepSpeed: Extreme-scale model training for everyone – Microsoft Research
https://www.microsoft.com/en-us/research/blog/deepspeed-extreme-scale-model-training-for-everyone/