Django's Data Science Makeover: Integrating D3.js and Bokeh for Data Visualization with Drishti Jain

Data visualization is an essential component of data science, and web-based data visualization is becoming increasingly popular. In this talk, we will explore how to integrate Django, a popular Python web framework, with two of the most popular data visualization libraries, D3.js and Bokeh, to create interactive and dynamic visualizations for your data science projects.

This talk was presented at: https://2023.djangocon.us/talks/django-s-data-science-makeover-integrating-d3-js-and-bokeh-for-data-visualization/

LINKS:
Follow Drishti Jain 👇
On Twitter: https://twitter.com/drishtijjain
Website: https://drishtij.github.io/

Follow DjangoCon US 👇
https://fosstodon.org/@djangocon
https://twitter.com/djangocon

Follow DEFNA 👇
https://www.defna.org/

Video production by the presenter and DjangoCon US 2023 volunteers.

DjangoCon US

3 months ago

[Music] Hi everybody. Over the past three days we have been learning a lot about Django and the wonderful capabilities it has, but it is equally important to acknowledge the data science world that's happening around us, with all the buzz about machine learning and LLMs and including all of that in our applications. So why don't we learn things such that we are able to use the capabilities of a few other languages as well, and integrate them well into our Django applications?

As I dive in, I'd just like to introduce myself. I'm a software developer, and I'm getting started a lot in the machine learning space as well. I have worked at product-based organizations as well as in the fintech space, at Adobe and Citi, and I'm also a social entrepreneur. I've been based in the US for a few months, but I'm originally from India, and back in India I have a nonprofit organization that works across 11 cities. We are a 400-plus-volunteer-strong team, and we work across environment, education, and healthcare. I'm also an international tech speaker, so I like to share whatever I do with the community around me. I get to travel the world and share my knowledge with developers, so you can find me at different conferences talking about the projects I'm working on, my learnings, and anything and everything I find interesting in tech. I'm also a career coach, so if you scan the QR code you'll find a link to my website, which has my calendar. If you're looking to advance in your career, or make a transition from a non-tech background to a tech background, all of those details are listed on my website.

Coming to D3.js: D3 is a JavaScript library that allows us to work with a lot of visualizations, and the best part about D3.js is that it does not give you a fixed template. You get to code each and everything about your chart as and when you want it. What this gives us is a raw layout, and you can customize everything about a visualization to match your expectations. It is not a single monolith; rather, it has about 30 discrete libraries that you can include in order to generate charts. These charts could be related to a machine learning model that you have created (how would you work alongside the model to display its data correctly?), or, if you're working with real-time data, you would want a very different representation, and you would want it to be very interactive for the user in any of the dashboards in Django that you build.

D3 does not have a default presentation of your data. It is not like a few of the Python libraries we have, where there is a set number of parameters and the display is more or less fixed, so you just modify the values, change colors, and things like that. You can actually modify and customize the chart according to your needs. It also does not introduce a new graphical representation by default; instead, it works with SVG and Canvas, so everything is customizable, and for any application that needs maximum expressiveness in its visualizations, D3 is the way to go. And if you ask me how you actually combine the two worlds of D3 and Django: you simply include D3 in a script tag, and that is what lets you do it. In a couple of slides we'll also see an example of working with and customizing a chart on our own, to show the different capabilities that are possible with D3.js and Django.
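
As a rough illustration of the script-tag approach, here is a minimal sketch of a page that loads D3 and receives its data from the Python side as JSON. The names `PAGE` and `render_chart_page`, the element ids, and the sample points are hypothetical and not from the talk. The standard library's `string.Template` stands in for a Django template so the sketch runs without a project; in a real application the same markup would sit in a template rendered by a view.

```python
import json
from string import Template

# Hypothetical page skeleton; in a Django project this markup would live in a template file.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
  <!-- D3 is pulled in with an ordinary script tag -->
  <script src="https://d3js.org/d3.v7.min.js"></script>
</head>
<body>
  <svg id="chart" width="$width" height="$height"></svg>
  <script id="chart-data" type="application/json">$data</script>
  <script>
    // Read the server-provided JSON and draw one circle per point.
    const points = JSON.parse(document.getElementById("chart-data").textContent);
    d3.select("#chart")
      .selectAll("circle")
      .data(points)
      .join("circle")
      .attr("cx", (d) => d.x)
      .attr("cy", (d) => d.y)
      .attr("r", 4);
  </script>
</body>
</html>
""")


def render_chart_page(points, width=400, height=300):
    """Return an HTML page that loads D3 and embeds `points` as JSON."""
    # Escape "<" so the data cannot close the surrounding script element early.
    data = json.dumps(points).replace("<", "\\u003c")
    return PAGE.substitute(width=width, height=height, data=data)


html = render_chart_page([{"x": 40, "y": 60}, {"x": 120, "y": 90}])
```

Keeping the data in its own JSON script element, separate from the drawing code, means the chart logic can stay static while the backend decides what data each page gets.
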
Consider a plot like this. This is a scatter plot with different shapes, using the basic iris flower data set and its three species. Instead of representing them as normal dots, I wanted different shapes to represent different species, and that is something I can customize. Or, if my application requires me to show which parts of the world are in nighttime and daytime, I have very quick ways to use the 30 inbuilt libraries, build on them, and customize them to quickly create visualizations like this. Or take showing the difference between two quantities: this is temperature, hot and cold, between two cities. With D3.js it is very easy to add the logic. You are in charge of anything and everything that you want to create, and you are not restricted by a library's parameters or a set output when presenting the kind of data that you want.

Interestingly, even contours are possible. Contours are very important when you're working with machine learning projects, especially when you're dealing with clustering, where you would want to have different features shown in the form of, say, volcano contours. So let's look at an example of how you would actually create this contour. Coming to the code, you'll see that for the chart, the width, the height, and all the SVG parameters can be edited. You can also decide what you want to append: are you going to take the mean, the median, or the mode? What kind of statistical measure are you going to use so that your contours are equally spaced? What distance do you want to show in the black outline of the contour we saw on the previous slide? And if you think a bit more deeply about this picture, you'll see that there are two major divisions. One is the colors: colors could represent different concentrations, and maybe that is the target cluster you want to display. The other is the contour lines: what should the difference between them be? Should it be based only on your data, or would you want something more to highlight a particular part of the data? All of this can be done right here in your code. Everything is editable; it's not a fixed template that you have to follow.

Once you define your chart, data is where you add your data in a JSON format. You simply import it into your chart, and as soon as you do this, for the final plot, this is how you add your data's values and your width, height, fill, and stroke. The stroke is black, which gives the outer black lines, and with the height and width you set the complete plot's dimensions. This helps you customize the contour as much as you want. Once you have all of this created, you just include it in your Django application like we saw in the previous slide, and that's it, you're good to go. This part of your Django application is now powered by D3.js.

Now consider a case where you are working with more real-time data. Maybe it's related to sensors: you have a sensor that's continuously giving you a lot of data, and you need that data working constantly in real time in your application. In that case, for modern web browsers, another library is the Bokeh library. D3.js can be used more when you
have a model already there and you want those visualizations to be presented very well. Bokeh is for when you are working more with a web-browser-based application, and if you're more comfortable with Python than JavaScript, you can use this library. The major advantage is that your Bokeh code, along with Django, is very easy to share: you can publish it as a web page or even in your Jupyter notebooks. If you're working in a research-based environment, or if your project is still in the R&D phase, you would want to share your project with your colleagues, and this is mainly done in the form of Jupyter notebooks, so in that case Bokeh becomes more advantageous for us. It is interactive and very powerful, and even inside Bokeh you can still use a lot of JavaScript capabilities to power your Bokeh visualizations much more.

This is one of the examples. On the left side we have different parameters that can be modified in order to generate different parts of the graph. This is for any time you want something interactive, you have a number of fields, and you're still trying to extract the insight from the data. Once we have gathered data, a key aspect of working with it is generating insights from it, and it is not necessarily the case that as soon as you have collected data, or applied some sort of machine learning algorithm, you will have the insights. There's a lot of back and forth that goes into generating the actual insight from data and making some meaningful business logic out of it. That is where an interactive plot is very useful, and Bokeh helps you do that in an easy, efficient way.

In order to integrate Bokeh with Django, your main HTML file would include these two pieces in your head and your body: your script tag and your Bokeh CSS. So your CSS has to be added, and in your main file the JavaScript version of the Bokeh library has to be imported. Once you have this, having visualizations like this is very useful. This is used a lot in industry, especially when data analysts are trying to understand the information that you created in your data science or machine learning project. Data analysts tend to work not with the final output, and not with raw output, but with something more processed. Using this semi-processed data to generate insights, they would want to play around with the data to see if there are additional insights to be found. Any time a project is targeted at making the business use case generate more revenue, there could be additional insights, and that is where interactive visualizations play a key role.

Now, coming to an important aspect that I just mentioned: whenever you have real-time data streaming, how do you take care of that? It could be related to sensors, or it could be related to conferences. For example, at DjangoCon we are generating so much real-time data, related to our social media posts and to the Slack channel; there is a ton of real-time data being created. So how do you handle this in Django? In Django we have Django Channels. You simply do a pip install for channels, and Django Channels is ready for use.

What is the need for Django Channels? At its core, Django is built around the single concept of requests from the browser going to a view function, which produces a response that is returned to the browser. That is, at its core, how Django works. The drawback is that there's no way of keeping a connection open; it only works when you have a request-response pair. It does not work if you want to keep a connection open, or if you want to send something back when you don't have a request. That is where Django Channels comes to our advantage. Channels allows Django to support WebSockets in a way that is very similar to HTTP views, and Channels also allows background tasks to run on the
go, in the same server. HTTP requests continue to function as they normally would in a Django application, but they also get routed to channels. Once you have the channel layer in Django, this is how the architecture looks: it is an additional layer that your Django application works with, and using WebSockets you can handle the continuous streaming of real-time data that is happening in your application. Django Channels works across a network. It allows producers as well as consumers to run transparently across machines, and it makes use of a concept called task queues. Task queues help you send messages from the producers to the consumers and back again. All of this makes it very usable for Django to handle real-time data, and once we know we can handle real-time data we are all happy, because handling real-time data is the future, and we observe it in any and every task we do.

Along with real-time data comes a lot of time series data. Data that is generated continuously, in real time, in a fast-moving world is generally time series data: there is a timestamp associated with every piece of data. Handling time series data is crucial for making any kind of informed decision. These decisions are important for identifying anomalies and for spotting and analyzing the different trends that are happening, and all of this is possible with the capability of Django Channels. And a key aspect
also with handling real-time data is that, for any data science project or machine learning model you build, having the capability to handle real-time data, and also having a feedback loop in your model to handle the feedback generated while interacting with that data, is the key to achieving great results from your machine learning model.

Now, Django Channels is very interesting, and it is a bit advanced, so why don't we try our hands at it and create a Django Channels application? The first thing is that you would pip install channels and configure the ASGI application in the Django project settings. The next part is your ASGI Python file: you would import the required libraries and set your default settings. The nomenclature I'm using across the application is chosen so that, if you try this back home, you can simply replace the project name with that of the project you're trying to build. In your application you would have a protocol type router. This is how you route your information, so all of your routing information goes inside this block.

The next part you have to create is a simple WebSocket consumer. You would have your consumers.py file, and you would spell out everything about what happens when it connects, what happens when it disconnects, and what happens once it receives a request: is there a message you want to send, or is there something else you want to send? All of that logic goes in the receive part. Once you have this, you come back to your ASGI file and include the URL router for the WebSocket. You would also list the different endpoints that your WebSocket consumer is going to have. These endpoints are important because you or the user would be accessing and generating real-time data, and generating requests, through the endpoint.

The next important part is to integrate Django Channels with views. We created the first part, the request handling and the response handling, so now how do you combine Django Channels with views? You'll create a views.py file, and this file is where you decide what actually happens: what kind of data do you send back to the consumer? For example, right now I just said 'Hello from DjangoCon US 2023', so that is the message I send back whenever I get some real-time data. You can modify this to also include some visualization; you can include inbuilt libraries or use an integration with D3.js and Bokeh. It's up to you, based on your application's requirements. Once you integrate this, you also need to specify the routing configuration to handle the message type in the consumer. Your message type could be different things: is it just a JSON dump of, say, text that you want, or is it something else? Interestingly, you can do a quick tweak here. Your JSON dump can actually carry insights, like statistical measures that you generate as part of your logic, and you just give them back as a JSON dump instead of doing the processing later on. This makes the visualization part very easy, because all of the complexity is removed from there: while returning the message itself you are generating everything in the form of JSON. You just keep appending to it, and this dump is what you give back. As soon as you give this back, you're done. We have our Django application ready, and you can just deploy it to any ASGI server, such as Hypercorn. That's it: you have your application ready, and it is capable of handling real-time data.

Another interesting application of real-time data is the chat that consumers have on your web application, because there are a lot of insights that can be generated from it. A chat application is mainly of the request-response type, so you can handle it without Channels as well, but with Channels you can also handle the case where you're still waiting on a request to come back while the person interacting with your chat is continuously posting things. Or maybe you also want to take care of the clickstream data that person is generating. If they are in your application, then while they're chatting they're also scrolling, and maybe they are browsing different pages, so there's a lot of clickstream data being generated, and you would want to capture that too in order to understand the user's behavior. Even in those cases, Django applications make this very interesting.

Now we know how to handle real-time data, and all of this was part of bringing the machine learning world into Django in a very easy way. The most important part, whenever we are dealing with machine learning applications and bringing them to the Django world, is to find the correct harmony. When we think about machine learning, it might seem that it is a bit complex, that it has to be interactive, and that it is ever-changing. Earlier we would look only toward supervised and unsupervised models; now we are looking at GANs and LLMs. This landscape is changing quickly, and a key aspect of keeping our application up to date is to have your machine learning model but also keep your visualizations interactive and ever-changing. You can call machine learning models directly in your Django application and utilize D3 or Bokeh for the visualizations, so you can have an interactive dashboard that helps users understand more about the kind of data they are generating. You can also have data uploaded directly, and have that data used to create the visualizations the customer wants. So you can create a Django Channels application, have different data being ingested from different sources, handle it as soon as you import it, and then work with different visualization techniques.

There's a reason why I put so much emphasis on visualizations. When you have tons of data, it is very difficult to find insights
just by looking into the data, or by doing different row and column modifications to understand the insight. Once you have the data visualized, especially for anomaly detection, it is very interesting to see it visually, and you'll actually identify some clusters or some data points that don't follow the norm. These anomalies could also be something that you had not expected. Maybe your data cleaning was not good, and that's why you have a point that falls outside, with a very different value compared to the rest of your data. One way would be to go back and check everything in your table and in your data modifications, but if you have visualized the data, it is very easy to see, 'Oh, this was a mistake during the data cleaning process,' which could have a very bad ripple effect if it is not handled. So having visualizations also helps you correct things that would otherwise go unnoticed.

So make use of the great interactive visualizations we have, and whenever you're dealing with real-time data, try to make use of Channels. Channels is a lifesaver. It's very easy to implement: as we saw when we walked through the application, it's a couple of Python files, and you can use them as boilerplate code. You add in your actual application logic, change it a bit, and it will work, so handling real-time data becomes very easy.

Towards the end, I would just say: visualize the world with the power of Django. Make use of all the capabilities of visualizations, not just in the end product but also in between, to see whether you are on the right path or not. That's it from my end. Thank you. You can find all of my socials on the QR code as well. I'm happy to chat about things that you're working on, or things you are planning to do with D3 visualizations and Django Channels, and I'm open to collaborations. Thank you.
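
One piece of the Channels walkthrough that fits in a few lines is the JSON-dump tweak: computing statistical measures on the server so the message sent back already carries the insight. The following is a minimal sketch of that step only, using nothing beyond the standard library. The function name `build_insight_message`, the choice of statistics, and the sample readings are assumptions for illustration, not code from the talk.

```python
import json
import statistics


def build_insight_message(readings):
    """Return a JSON string with summary statistics for a batch of readings."""
    # An empty batch has no meaningful statistics, so report only the count.
    if not readings:
        return json.dumps({"count": 0})
    summary = {
        "count": len(readings),
        "mean": statistics.fmean(readings),
        "median": statistics.median(readings),
        "min": min(readings),
        "max": max(readings),
    }
    return json.dumps(summary)


# Hypothetical sensor readings standing in for real-time data.
message = build_insight_message([21.5, 22.0, 23.5])
```

In a Channels `WebsocketConsumer`, a string like `message` is the kind of payload that would be passed to `self.send(text_data=...)` from the receive handler, so the browser-side chart only has to parse and draw it.
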