OLS 3 Week 6 Open Science I: Project Development

Malvika Sharan: All right. Welcome, everyone. This is another of our cohort calls. Today we have four great speakers who are going to talk about open source development, and Open Science in general. So this is the first of our three Open Science modules. Some of the logistics, again: please indicate, by starting your name with S or W, whether you prefer being in a spoken or a written breakout room. Please choose one to make it easy for us to assign you to a room. We have an Otter live transcription going, so you can click on the link at the top of your screen and watch it work; it's wonderful. One other thing to note is that we have a code of conduct that applies to this call. So if there is anything at all that you would like to report, please email team@openlifesci.org. You can also reach out to Bérénice, Emmy or me separately if you would prefer to reach out to a single person. So there's a huge announcement today: we have been running a poll for naming your cohort. Our first cohort was called Open Seeds. Our second cohort was called the Masked cohort. And now we have the Perseverance cohort. So this is your name; you will be known forever as Perseverance of Open Life Science. So thank you for polling. With that, I will hand it over to Bérénice to take you through what we're going to cover.

Bérénice Batut: So thanks, Malvika, for starting the call. So today will be our first call about Open Science, and we will mostly discuss the project management skills needed during the development stage. There will be three calls on Open Science. This call will mostly focus on open source software, open source hardware and open data. But it's the first call; there will be more aspects covered during the next calls. There are two calls that will happen in the next week or two. So in this first call, the first speaker, Renato, will speak about agile and iterative project management methods. So I will hand it to Renato. Renato, you're here. Do you want to share?

Renato: Yep, I'm here. Let's see if my internet collaborates. Can you give me permission? Yeah. Can you track it?

Unknown: Give it a try.

Renato: Okay, can I get some thumbs up if you can see this?

Bérénice Batut: Perfect. So you now have around 10 minutes of presentation, and then some discussion questions. So participants, feel free to drop your questions in the document. Okay, thanks.

Renato: All right. So hi, everyone. First of all, let me open by saying that I am not an Agile expert. These are just some techniques that I tend to use often in my day to day. I will explain a little bit what this is and how you can benefit from using it. So first of all, what is Agile? Agile is a bit of a principle or a framework to

organise and structure projects; let's call it a project management technique. If you're one of those people who tend to have a lot of post-its around with tasks to do and so on, you're probably already using some of the techniques that the Agile movement inspires. But to keep it a bit more focused: Agile is something that started in industry. It was mostly driven by an intention to deliver a good product to the customers, but at the same time to deliver that product as quickly as possible, even if in a prototype state, while allowing you to adapt and modify the product, again as quickly as possible, to better fit the needs of the client. As you can see from the Manifesto for Agile Software Development, this was, as I said, something that started in industry, but specifically in the software world. And the striking point, and the reason why it's called Agile, is in part because of this one. I don't know if you can see my mouse cursor. Malvika, can you maybe give me a thumbs up? No? Okay, then let's see if I can do this. Can you see it now? Yeah. So what I was pointing to is "responding to change". This was one of the aspects that motivated giving this name to the movement, or to the framework: Agile in the sense of being very fast and responsive to requests and changes

. And so, if you look a little bit into what this framework consists of, or if you compare it with other approaches that people use in the field, you may encounter the word Waterfall. You might also find other names for this kind of iterative process. There are many different ways that you can structure this, and I'll explain a few in a second. But if you just look at the waterfall concept, you can see that it falls from one stage to the next. The key point here is that the whole project plan is defined from the beginning, and you just follow along in a sequential fashion. So you can imagine that if at any point during design, during implementation, or in one of the later stages there's something that you need to change about the plan, this process is rather rigid and doesn't really allow it. The iterative process is a bit more generic, or a bit more responsive. It typically starts by breaking a larger problem into smaller chunks, so that they become more actionable. Exactly what "small" means differs a lot between paradigms; it can be something that lasts a day, or something that lasts a few hours. Then you do these so-called sprints, or you aggregate these tasks into milestones. And once each milestone is completed, you reach a stage where you have a first release or a prototype of t

he thing that you're trying to accomplish. In the case of your OLS project, a milestone could be, for instance, that you have a website to build. You can think about how to break that big task into smaller chunks, and the first article that you write could be an example of a first deliverable. So to sum it up, it's a technique primarily for software development, but we can use it elsewhere, and we can go into a little more detail on how to actually do that. It's great for project management: it helps you visualise the work that you still have to do, and also invite others to join the project, because everything is very visible in terms of what needs to be done and what has been done. There are lots of variations in how this is structured, and also advantages over the traditional waterfall method. So I mentioned breaking things down into tasks and milestones and so on, and I mentioned as well that you could aim for slices of one to two hours, ideally not more than a day or two. The reason for this fragmentation is that you want to have a good sense of progress. It's very often the case that you estimate a task to take one or two hours, and then it turns out that you spend an entire afternoon on it, because we get distracted or because there are other things that we didn't really think of. The idea here is that the Agile approach will help you structure those thi

ngs that are outside of the task that you're doing. They just become tasks again, which will get picked up later and moved to a milestone that you will complete at a later stage. To give you a more concrete, real-life example: if you have already explored GitHub a little, and perhaps browsed some of the existing projects there, you might have seen a project like InterMine. In this diagram (sorry for the background noise), each of these grey boxes is a version release, and the tasks themselves are within each of these grey boxes. In this case there are several milestones, and you can already see an estimate of when they would be achieved, and also a very colourful interface for how to label things and how to structure them, not just in terms of milestones but also in terms of what these tasks are all about. A slightly different approach, and one of the more popular paradigms in Agile, is the Kanban-style board, which you may have used. There, instead of the versions or milestones we mentioned before, you have just the notion of what is to be done, what is in progress and what is already completed. So this is a simplified version; it is not so focused on software versions or specific milestones or goals, but more on capturing what is actively being worked on. And you c

ould do this process within a milestone, so all of the tasks that you now see on the screen could be within one milestone alone. For GitHub there are also some simplifications and some automation that you can do, and we can talk about that later if you're interested. Just to wrap up with some examples: in the website context that I mentioned before, one big task could be to acquire a domain for that website, and you can see an example of how to break that task down into smaller tasks. Or you may have a specific section of the website that you want to create, and you can see that for that there are perhaps more tasks that need to be completed. So ideally, you would break it down again into successively smaller steps. I will skip this for the sake of time. I didn't go much into the jargon that is involved. I mentioned that there are different sub-frameworks; Scrum, Kanban and Extreme Programming are some examples. They all follow the Agile principles. What changes between them is how big the tasks are, how big the milestones are, how often you do the larger loop or the smaller loop, and for how long you do these things called sprints, which are a way of getting the entire group working collectively on a set of tasks. Without going into too much detail, they are just different ways of handling the workload and prioritising the tasks that you have to do. To finalise: I tried to be very superficial here, to give a high-level introduction to this topic. If you're doing software development, you will find that all of this probably translates a lot more directly; versions, milestones and so on make sense there. But I personally use this for my own day to day, just to manage tasks that I have to do, things li

ke reporting or planning meetings, or anything of this sort, and I find that it works rather well. Okay, so I think I'm a little bit over time, so I will close it there.

Bérénice Batut: Thank you for the presentation. Does anyone have any questions for Renato? Feel free to ask them in the chat or put them in the document. Currently, there is no question there. Does anyone want to ask anything? So I think we have one.

Malvika Sharan: Yeah. My question is specifically for Renato: how can this be applied on a day-to-day basis? It sounds like you always have to do it for a big project. Do you have tips and tricks for day-to-day agile development?

Renato: So I don't have a screenshot that I could easily show with all the tasks that I have. But what I do is try to keep it very colourful, which also keeps me motivated. I try to split up tasks not just by milestones but also, for instance, by

what kind of topic they are. If I have a meeting that I want to plan, I will create a milestone that is that meeting, and this could include, for instance, inviting all the speakers or getting in contact with many people. You could also have, for instance, a report that you need to write, and that report could be a milestone or a task, depending on how you break it down. Within that report you have, let's say, bullet points of things that you want to write or want to make sure are included. In terms of the day-to-day process: in GitHub, you can also have these little checkboxes that you can click within a task. So you can make an outline of lots of bullet points, and you just go and click them as you progress through them. At the end of the day, you can review what you achieved and plan what you want to pick up the next day; it can be the same task, but just a differ

ent point in the same task, or it can be a different task, depending on how much time is left.

Bérénice Batut: So are you using just GitHub for that? Or do you have another tool that you recommend?

Renato: So I've used Trello in the past, and actually the GitHub interface that was in the screenshots was very much inspired by the Trello interface. But nowadays I mostly use not GitHub but GitLab, because we have a GitLab instance running locally. It also has not just an issue list but also these boards where you can move things around. You can put those colourful labels on the issues, and you can turn them into some sort of milestone or category, like to do, pending, done and so on. I find that this works well. You can also have notifications, so you can set deadlines on tasks, and it sends you reminders when those milestones get close. So yeah, I find it very practical. The only discipline that you need to have is to visit it as often as you can, because there is a little bit of maintenance in keeping it up to date and shiny, so to say. If you don't go there often, you can imagine that some of the tasks become obsolete, and some things will need to be modified or updated if things have changed.

Bérénice Batut: Thank you. Other questions? So do you want to verbalise your question? Or do you want me to re

ad it right now?

Emmy Tsang: I can do it.

Unknown: Thanks a lot, Renato, for this presentation; it was very useful. My question was: beyond personal use, do you have some tips or experience in making this part of the community culture in a group? For example, I found that when you have a heterogeneous group, it can sometimes be complicated to convince people to use these tools.

Renato: So I don't know how much I can reveal there. But in my previous group, we had that dilemma at some point. A lot of the communication and a lot of the core value within the group was being organised by email, and this had the disadvantage that there was a lot of work being done that was not really visible. At some point we decided to change this; it was a bit of a group effort as well. And we did exactly what I described: we moved to GitLab and started using GitLab issues (you could do the same with GitHub). It not only allowed everyone to see what was going on, it also reduced duplication of work a lot, and it opened up the space for some automation. For instance, if you wanted to do a specific task that the group already had as a core pipeline, or some software analysis or something like that, then you could just create an issue, and there would be one person specialised in that who would take care of handling the task. So what I'm trying to say is that you can assign people to categories of tasks or topics, and then people take responsibility for those. But it doesn't need to be a common agreement that everyone will share a little bit of the load in terms of maintenance. Once you have a core of people coordinating everything the same way, and this is one of the things that you need to agree upon wit

h your community, then it runs rather smoothly, in my experience.

Bérénice Batut: Thanks a lot. So thanks again for the talk and for your question. So now we move to the next speaker, Helena, who will talk about open source software. Helena, you have 10 minutes of presentation, and questions afterwards. Thanks. You are muted.

Helena: I am muted. Okay, everyone can see it? Okay. Perfect. So my name is Helena Rasche. I've been working in open source for quite some time now. I'm currently employed by the Erasmus Medical Centre and the Avans Hogeschool in Breda, and I'm here to tell you a little bit about open source in research. So first, what is free and open source software? The term is a bit important, because open source software and free software are not necessarily the same things. Free software doesn't mean open. Many of us remember freeware on the internet from when we were younger. This is software where you have no access to the source code, and you can't work with it or modify it due to the licence. This is not great for science. On the other hand, open source doesn't mean free. So for those of you who are producing research software, just because you're making your software open source, it doesn't mean it has to be free software. The intersection of these two is often known as free and open source software, or FLOSS: free/libre open source software. And this just refers to all software t

hat is both open and free. Open source is simply licensing your work so it can be used how you want. Making your software open source is just a matter of setting out the terms of how you want your software to be used. It doesn't mean that you have to give up control of your software, and it doesn't mean that you're putting it out to the community forever. It just means you're choosing for yourself, setting boundaries on how you want your software to be used. So it's a very important step all the time. For those who want it, open source also makes it easy for others to remix your software, to reuse it, to build upon it and add new features if they want. And if you're building a project where community is important, where you want your software to be used by a lot of people, making it open source in such a way that people can reuse it and work with you can be a really great boost for your software in terms of visibility and the number of people who want to use it. So why use and promote open source? Open source is a part of the ethics. All these are options, and if you want people to be able to use your software, you need to license it. One of the common fears I hear about a lot is: if I publish my software online, if I do it in the open, I'll be scooped; someone will take my software and claim it's theirs. But this isn't necessarily true. If someone steals your software, there is at least a traceable log in GitHub, which we'll get to in a minute. There is a

copy of your software already online. And if you already have a community around it, it'll be very obvious that someone took it. If you're still worried, I've heard of some people publishing preprints as a way to document that they were first to write the software, and to really stake their claim on it. So don't worry about that, just work in the open. It's better for the community, it's better for the world, and it's good for science. Publishing and sharing open source code: one of the easiest and most effective ways to do this is version control. I'm sure you're all learning about Git and GitHub, if you haven't already. Version control is a fantastic way to publish and share code with others. It gives you a whole timeline of your software, and it makes it easy to reuse, contribute to and modify your software. So why? Collaborating is easy. One of the common things is reverting accidents: if you make some bad changes in your code, or one of your collaborators does, you can always revert, you can always go back to before then. It makes it easy to integrate the changes from multiple developers, like with Galaxy, which has some 200 contributors to the code base, or the training materials as well. All of us can work together collaboratively because we use version control. Also, offsite copies of your software: everyone has computer issues, everyone loses a hard drive at some point or gets their hard drive encrypted by some

hackers, things like this. If you have all of your work in the open, in public, then you can just download a new copy and start working again. Git and GitHub are a very common choice. Git is one of the most common version control systems; there are others. GitHub, likewise, is one of the most common Git hosts, but there are others; it depends on what you want to use. One of the nice things you get with GitHub is a large existing user base, a large community of people who will be able to contribute to your software with a low barrier to entry. If you need to learn more about Git, there's a great set of lessons from Software Carpentry. But one important note is that Git is very, very complex. My partner teaches a get together session where they teach the colleagues in our office how to use Git, and there's just so much to learn about Git. But you don't need to learn all of it now. Just start with the important parts; the rest comes later. You'll see a lot of guides online that will say, oh, you need to learn about how the commit graph works and things like this. But if you just want to publish your code, you don't need any of the fancy stuff. So, a few steps to make your work open source. A README is a very important part of this. If you want people to know what your software is and how to start using it, that's the number one thing people will see, so be sure to include lots of good images there. Licence: if it's going to be open sour

ce, it needs a licence file. It takes two minutes to start a licence with GitHub; it's really easy, just doing the same thing. Contributing guide: GitHub has some template contributing guides that make it really easy to tell people how you want them to contribute to your repositories. Having a public roadmap: Renato talked about the Kanban boards and Agile. A public roadmap is a great way to tell the community what you're working on, what features you're going to implement, things like that, which can help people get excited about your software. Publishing a list of issues, same thing. It feels bad to say, hey, my software has bugs, but by at least putting them out in the open, you can track the things you've done or not done. A code of conduct, as I'm sure you've learned from OLS, is a very important thing. Contact and citation information can be very useful as well. If you have a GitHub repository, you can easily get those from Zenodo and figshare. If you need a.. there's a nice website for h

ow to choose a licence. There are lots of different licence choices. And they give you a lot of different freedom to choose what you want your software to be able to do or what you want other people to be able to do with your software. Some people don't want businesses to use their software for free, and there are licences that support this or some people want everyone to use it for free. Lots of different choices. But the ultimate goal, of course, is full reproducibility. And we're getting a lo

t closer with things like Jupyter and Binder, where you can publish your software but also a notebook where people can run your software online, which is a fantastic way to get people to use your software. Taking it further, there are a lot of ways for you to contribute if you're a first-time contributor, and to get involved in the open source software movement and contribute to open source communities. There are lots of nice links here if you want to explore them. And lastly, The Turing Way has a nice handbook on reproducible data science, making software open source, and publishing it and making it accessible to people. So with that, thank you. I think I'm 30 seconds over time. Let me know if you have any questions.

Bérénice Batut: Thanks, a great talk. Does anyone have any questions about open source software? Feel free to ask. So one question for Helena: if you are a newcomer to open source, which events do you really recommend joining to get familiar?

Helena: Definitely Hacktoberfest, but that's just because you get a free shirt if you contribute four pull requests, which is always a good motivator. That's what I contribute to every year. I have not participated in the others that were listed there, the Mozilla Global Sprint or 24 Pull Requests, so I can't speak for those. Maybe you know about Mozilla.

Bérénice Batut: Is the Mozilla Global Sprint still running? I don't think so. It ran for a few years, but I haven't heard anything about it recently. It was a nice global event. But Hacktoberfest, definitely, I liked it, especially because people on GitHub usually label issues in their projects that are easy for newcomers. So it's a nice,

Helena: really nice thing that people do: easy, low-hanging fruit issues that people can contribute to.

Bérénice Batut: And there is a question: can you share some good tips for open source software maintenance

? How can we get more people learning about and contributing to my project?

Helena: This is a broad question. Maintenance is always a long-term topic, and it depends on your funding structure, of course. But if you can build a community around your project, a lot of that is made easier by doing things like the code of conduct and the contributing guide. These tell users or newcomers how they can easily contribute to the software; they give a point-by-point guide. The training materials are a good example here. We say: do you want to contribute a training material? Then here are the steps you need to follow, here is how to set up your environment to contribute, here are the contributing guidelines. So we need your information, we need it to be spell-checked, things like this, checked for build errors; things that someone can read and say, okay, I know exactly what I need to do to make a really good first-time pull request. That's one of the best things that you can do to start building the community, which will, in the long term, hopefully help people maintain and contribute to your project. Learning about your project is going to be a matter of having good documentation, and there are a lot of different types of documentation. There's developer documentation. There's API documentation: if you write software, what are the parameters for this function call? Then there's documentation like tutorials, which are really for a different audience completely, but also a necessary part of onboarding people. There are different steps: you want people to know how to use your software in general, so they need training materials to get them started with how to install, how to run, what the commands are, blah, blah, blah. But then you also need the next level for once people get deeper; you need the developer documentation: here's how you can contribute, here

is the structure of my software, here are the different components that you might want to work with. And then you also need the last one, the API documentation: here are the exact parameters you can call if you want to integrate with this software, that sort of thing. I hope that covers it. There's a good graphic that I found a long time ago that I really liked. There are just different types of documentation, and they're all important.

Bérénice Batut: Just don't be afraid, everybody. You don't need to have everything there from scratch. You can build it with your community, onboarding people to help you write this documentation. It's one aspect where you can get your community behind you, because it can be a bit frightening when you think about everything you need to do.

Helena: Yeah, very good point. Thank you for emphasising that. You can really build it up over time; it doesn't need to be there from day one.

Bérénice Batut: And there's the mentoring aspect: empowering people by mentoring them through your programme, so that they in turn train new people, and so on. I think that's also part of the answer to your question. So what is the next question? How can I convert a non-open-source repository to open source? It's the last question; after that we move on.

Helena: The first step of this is usually to make sure you remove any secrets from the repository's history. A lot of times when people have

closed repositories, they hard-code things like database passwords. Make sure you strip all of those out of the history. Once you've done that, you can just make it open and add a licence. Adding a licence and publishing it somewhere like GitHub is really the only thing you need to do to make it open source software. After that, everything else is extra; it's decoration on top. It's nice decoration, but it's not necessary.

Bérénice Batut: Thanks for the quick answers. And then I hand over to Emmy. Thank you.

Emmy Tsang: Thank you so much, Bérénice. And next, we have Esther.

Esther Plomp: Hi. I hope you can hear me. Perfect. Thanks. I hope you also see my slides. Perfect, wonderful. So hi, Perseverance, I am very excited to be talking to you today about open data, which is one of my favourite topics. So my name is Esther Plomp. I'm a data steward at the Delft University of Technology in the Netherlands, and I'm also one of the mentors of this cohort.

And you can find the link of this presentation in the slide itself. And I should have also pasted the link in the notes. So do let me know if you cannot access it. So open data. What is open data? Open Data is any data that's made freely available for use and reuse by anyone and everyone. So what does that mean? It means that everyone should have access to the data. So it should be available on the internet, for anyone to access on demand. Everyone must be able to use, reuse, and redistribute t

hat data. So participation is also very important here. It should also be transparent what kind of information or what kind of open data you're accessing. So there should be some information about data generation, and collection and about the data. In terms of reuse and distribution, that's very important for open data, you should be able to redistribute data without a lot of restrictions. And we'll get back to that later. And open data should also be interoperable. And that means that you can i

ntegrate it or link it to other data. And if you do that, in a machine readable format, it's easy to do that automatically. So these definitions are from the Open Data handbook and the World Bank and I pasted those things in the slides. So do have a look at those websites. If you want to learn more about the terminology. What is open data and not? or? Yeah, just to highlight that the sentence "data will be available upon request" is not a good sign in terms of open data. So I'm sure we've all se

en this sentence in a publication where the author states that, sure, if you email me, or if you call me, I will provide you the data. So it is not open, because you can't actually access it by yourself; you would have to go through the author. And a study by Vines and colleagues has indicated that it is not as easy as it sounds. You can email researchers, but because we tend to hop institutes, we change our email addresses quite frequently. So a lot of the time, it's going to be difficult to contac

t a researcher. So their conclusion in their paper was then also that research data cannot be reliably preserved by individual researchers, which might sound a bit harsh. But to be fair, I don't think we should be asking individual researchers to preserve data for the long term. That's why we have data repositories. What is also not open data is FAIR Data. So this is a term that's been used increasingly in the data world, but it's actually not the same as having data that is open. So FAIR, means

findable, accessible, interoperable and reusable. What does that practically mean? I put some links into the slide where you can look it up in detail but I am going to do it very briefly right now. So findable means that your data is findable on a data repository with some metadata. So information about the data, and a persistent identifier. And this persistent identifier ensures that you do not end up with a broken link. So that means that your data is going to be accessible as well. So access

ible seems kind of like it's open. But that's not actually what accessible in the word FAIR is about. Accessible means that there's a procedure in place that allows you to obtain access to the data. Potentially, your access can still be declined, but there should be a procedure in place. So this is where it's different from open data: FAIR data is not necessarily open. It can be, but not necessarily. But here also, we see that interoperable plays an important part of FAIR, and your data is interoperable w

hen you use open formats. So formats that anyone can use, such as PDF instead of Word documents, for example; not everyone has a Word licence and can just use that software. But also, if you use commonly used vocabularies or standards to describe your data, that means it's easier to integrate it with other data and more easily understandable for others. The same goes for reusable. If you want your data to be reusable, you should document it well, so that others can interpret the data. And in or

der to do that, it has already been mentioned for software, but the same goes for data: you can set up a README file, and I put two examples of README files in the presentation. So do please have a look if you want an example. And also, there are things such as data dictionaries, or codebooks, which are a little bit more elaborate in comparison to the README file. But it's also very good practice to document your data so that others can actually reuse it. And then what is also the same for da

ta as for code: in order for it to be reusable, you need a licence. And for data, the most commonly used licences are the Creative Commons licences. So we do use different licences than for software. Because data and software are different objects, they function a little bit differently. So I would not recommend using a Creative Commons licence for software. But for data, this is standard practice. And the licence chooser for software was already linked before this; I'm actually linking to the same o

ne because it's a very nice website. But for data, we now also have a licence chooser. So you can go to the link in the slides to check that out. So just briefly, to also introduce what a licence is: it is basically a sort of standardised contract that tells anyone what they can do with your data. So they don't have to ask you what you allow them to do with it. And that's actually why it's very important to have this licence out there in the open. Because if your data or your software doesn't have a li

cence, it means no one can actually use it without asking you. So this is why it's very important to have that licence. I'm getting distracted by the chat. I'm not looking at that right now. Sorry. We'll get to it later, hopefully. Right. Open data has CC0 or CC BY licences. This means that there are not a lot of restrictions on re-use of your data. So for example, CC0, or the public domain, allows the re-user to do basically anything with your data. And they don't even have to attribute you. So they

don't necessarily have to cite you. As good practice, though, I would recommend citing your sources. But if you would like to enforce that a bit more, you could use the CC BY licence. So that allows the same things, as you can see in the slides, as the public domain. But it also ensures that every user should cite you or credit you. And then the further down you go with these licences, you see that they get a little bit more restrictive. So for example, the CC BY non-commercial non-deri

vative, the one at the very bottom of this list, does allow you to copy and publish the data, but it doesn't allow for commercial use, and you can't actually modify and adapt it. So this is not really open data in a sense, because there are a lot of restrictions placed on the data. But in any case, if you want to choose that licence, that is, of course, up to you. And also just a highlight here: if you're not sure what licence to choose for your project and you want to share data

, please also feel free to contact me if you just want to ask or just have a chat about it. And then something which I find very important is that it's not just about the access to the data and redistributing the data. So I pasted two of my favourite quotes regarding Open Science on the slides. One is that there is no Open Science if science is not open to all. And that's a quote from the blog post "bropenscience is broken science", and I can recommend reading that one. And also the quote, incl

uding more ways of knowing and understanding our common world within the great scientific conversation would enrich and diversify its collective ideas and creativity, for the common good. And that quote is from Open Science beyond open access, and that's a report which I also highly recommend that you read. So it's really not just about having access to things. I think open data is really about allowing anyone to participate in the creation and generation of that data as well. Because otherwi

se, I don't think it's truly open. And if you want to learn more about that, there's the Data Equity Framework by We All Count. And there is a webinar on data justice talk story, which I recommend, and a book on data feminism by Catherine D'Ignazio and Lauren Klein. So they go a little bit more into this topic of why it's important that everyone can participate. And then I also listed some additional resources. Paula Andrea Martinez gave a similar presentation to this one last year for

cohort two, and it's a really good presentation. So do have a look. And if you want to really participate in open data, and you want to get more involved, I can recommend joining the Research Data Alliance, which is a global research community that really wants to increase awareness and facilitate sharing of data. So this is a great way to get in touch with colleagues, and get involved in initiatives and working groups and do some really practical stuff as well in terms of data sharing. Of cour

se, The Turing Way has already been highlighted. But it has an open research or reproducible research chapter. So you can also find more information there. And the Open Data Handbook by the Open Knowledge Foundation. And then if you're running out of time, and you don't want to read full books and presentations and all these things, I listed two blogs: 10 arguments against Open Science that you can win, by Malvika. And I'm not necessarily saying that you need to be an open dat

a evangelist. But if you're hesitant about sharing your data, this is also a good blog to read, and maybe reconsider why you're hesitant about this. And I also wrote a short blog about how can you make research data accessible? So I'm shamelessly plugging that. And this blog is a bit more practical in terms of how you can actually make your data open. So we can't really go into that right now. Because of time restrictions. But if you have any questions regarding that, please let me know. Yeah, I

mean, your Slack channel, I'm one of the mentors. So I think that's the easiest way to find me. But I'm also on Twitter, or you can send me an email. And I think you might be able to tag me on your GitHub repos; not 100% sure, but happy to answer any questions there as well. Then that was it. Thank you. I know you have to run soon. So I'll keep it short. Folks, please leave your questions in the Google Doc if we don't have time now. Shall we have one quick one? Alexander asked: Do you have

any experience with the Open Data Commons licences? Hope Open Data Commons licences. Yeah, let me look at Google Doc. Sorry. Is that do you mean the same thing as the Creative Commons licence or is this a separate? Let me check the website. It's from the Open Life Sci. Yep, sorry. Unknown: Not me. They are they are very, very similar, but not not Same, these licences are not for data. They are not very useful. But I think they are more focused on on data on open data. Emmy Tsang: To be honest,

I haven't heard of them. So I'm happy to check that out later. But yeah, there are multiple licences, not just the Creative Commons licences. So yeah. Thanks for sharing. Thank you. We learn new things every single time we have this, which is great. I'm sorry, folks, I think Esther, I'm assuming you have to run. I can do five more minutes. So unfortunately, I have to go to another workshop. So I can't be in the breakout rooms. But again, please do contact me on Slack if you wan

t to have like a second pair of eyes going over your data or your licence or whatever, if you have more questions, or if you're watching the recording. Okay, let's do it. Let's do as many of these as we can before you have to run. I understand that FAIR isn't enough to make data open. But isn't it a requirement that open data are FAIR? Preferably, yes. Because the FAIR principles really do, yeah, increase reuse, and they're not really inhibiting each other. So if prefer

ably, yes, but the other way around, then yeah, doesn't work so much. That answers the question. Thank you. And let us ask, where's the best place to store your data in the public databases to make it findable? Yeah. So there's, there's a bit of debate, what is of course, the best place. So if you have a disciplinary specific repository, I would recommend you to deposit your data there, because that's the place where your colleagues are going to look for the data. So that's the best place to do

it there. If there's no disciplinary specific repository, which is very highly likely the case because not every discipline, or every subfield has its own repository, then you can use a general repository, and one of the examples is Zenodo, but also figshare. And there's Yeah, there's a couple of others. In that case, I would recommend to use a repository that really assigns these persistent identifiers, because that makes your data more discoverable. And also, it allows others to cite your dat

a. And it's persistently available. So those are, I think, the two main points to pay attention to when you're looking for a repository. And I can share some resources in the notes about how you find a repository. I think it's also in the blog, not 100% Sure, but I'll share it. Thank you so much. Alright, folks, if you have any more questions with that, thank you so much. And since then, hopefully, we now have breakout rooms. So going back a little bit to the first talk that we had today on iter

ative project management and design. So the breakout room will be 10 minutes. And just a small reminder, before we start, if you haven't already done so please, we really appreciate it if you can rename yourself with a W or an S before your name. So we know where to put you Malvika Sharan: can actually Emmy Tsang: 10 minutes, there are sort of like a sort of a two part task. So the idea is that we would like you to break down your first milestone into achievable chunks. So that's what you're goi

ng to do for the first five minutes. Working silently on your own. You can use the Google Doc to take notes as we do this. So that's from page 10 onwards, you can see the status of your notes. So after five minutes of breaking down your first milestone, share it with your group, what you find interesting, and or challenging. So if you are sharing in a spoken discussion, then you can of course, speak freely amongst yourself. If you're sharing in a written discussion, instead, we'd like to ask you

to please keep your conversations written. This could be done either in the Zoom chat, where nobody else would be able to see it other than the people in the room, or in the Google document, where everyone else will be able to see it. I hope that's clear. And I think you're ready. Malvika Sharan: Yeah. So the spoken rooms are rooms one, two, and three, and the rest of you are in the written room, and I'm opening the room here. It'll Emmy Tsang: pop up on the screen now. I think we're

all. So hope you all had a good discussion and some interesting experience breaking down your first milestone into achievable chunks. Does anyone want to sort of verbalise some of their thoughts on what you found interesting or challenging in the process? Please feel free to unmute yourself. Yeah, go ahead. Unknown: Good morning, afternoon, evening. Um, I think one of the challenges which I had on my first milestone is really learning what I did not know. So I think going into it, oh, GitHub, I c

an do this business. And then it opened up like 20 other doors of more questions and excitement and being like, oh, there's a lot more parts which I really need to fill in to make this a complete project. So I think the deeper I get into these milestones, the more I realise how much I could actually flesh them out. And that's tough, but rewarding, I guess; more work. Emmy Tsang: Thank you so much. Yeah, no, I feel you. Does anyone have any tips to share as to how to make this a bit more man

ageable? Or is there anything that you find challenging or interesting as well in that process of breaking things down? I'm checking the notes as well. And I see that folks have mentioned, you know, it's a bit of a chicken and egg problem: you first have to have an idea, but then, without the users' needs, you can't really have the idea. So it's also a bit of that iterative process of going back and forth between the problem and the solution and the problem. And so, yeah, it's

great that you notice that; that's really great to see. I hope you find this process rewarding nonetheless, despite the fact that it could feel like a little bit of a circle at some times. But I assure you, you're moving forward, and in many ways that you're not aware of at this present time. But yeah. So if you have some time, yep. Unknown: I would like to make a comment on that problem also of, like, learning all these new techniques. It might be a great way of on

boarding new people to your project, right? Maybe you know somebody who's an expert in your department that already knows GitHub, and what seems to be like a big task for you is like a five-minute job for them. So you make this very low entry barrier task for them. And you get them hooked into a project, maybe. Emmy Tsang: That's a wonderful tip. Thank you, Andre. Yeah, it's also very satisfying to be able to help someone, so do pass this wonderful feeling on to other people. Anyone else wo

uld like to say something about their experience breaking things down? Not afraid of the awkward silence. So let's do that for another 10 seconds. Okay, if not, we should go to the next section. But please keep thinking about this. Please keep trying, please keep sharing your experience. I now pass to Malvika. Malvika Sharan: Great, thanks so much Emmy. And I'm very glad that Andre already spoke. So it makes it easy for me to introduce Andre. Andre is one of the open hardware rela

ted mentoring programme which is very similar to OLS and we generally call them our sister programme. He's a co-founder of that and he's here to share with you what open hardware means. Over to you, Andre. Andre: Hi, everyone. Thanks for the introduction, Malvika. Let me just find my, I should have done that before, find my slides, which I have actually shared on the document already. Yeah. So hello, everybody, thanks for the space and time to talk about open hardware, which is something t

hat I'm really passionate about. And I guess I think we start by taking things one step back, because we spoke about all of these wonderful initiatives in science, right, we spoke about open data, and open source software, which is more known to everybody. Where's my screen? Here, I hope? Can everybody see my presentation? Okay, cool. And so the point that I'm trying to make, taking a step back, is that without hardware, without equipment, there is no science in a way, right? Lik

e we can download all these amazing data sets. But if you really want to be in control of generating data, and, like, answer exciting questions locally, and questions that are important for your science in your community, let's say, then you need the proper tools to do it, right. Malvika already introduced me, but hello from my side, so that I get started. I work at the University of Sussex, where I have a job that I really, really like, which is developing open source hardware

for the Department of neurosciences. I'm also volunteering a lot of my time to TReND in Africa, which is an NGO trying to help development in higher education in Africa. So we do a lot of sharing of our knowledge so that people can actually build their own tools. We do a lot of workshops. I started a small website called Open Neuroscience, and I'm working with Julieta Arancio and Alex Kutschera, as I mentioned, on this programme that is similar to Open Life Science but for hardware, called O

pen Hardware Makers. All of this started about, I mean, I think I already spoke about, like, why open hardware, right: we need hardware to do the experiments, the things that we do. And here's just one example in life sciences. Right. So if you are thinking about microscopes, which are one of the workhorses inside life sciences, right? These tools were first developed, like, in the late 1600s, I think, if I'm not mistaken, and this one that you see on the left is actually from 19, t

he turn of the 20th century. So 1904, or something, and the one on the right is a little bit more recent, as you can see, like the modern microscope didn't change much over the last 100 years. And still, if you want to get one on the right, you would spend something like 10,000 pounds, right? And I mentioned costs, because although we really like to romanticise things about science in the end, like we still need to pay for things. And so these things are really expensive, although the technology

in them didn't change much over the last 100 years, right. So there is something wrong there from my perspective. But even if you don't consider that, if you think about it, you are, let's say, in the global south, where I'm originally from, trying to buy a microscope: you only find distributors from the global north. So you have to import things, and they mostly have their clients in the global north in mind. So if I bring it back to my hometown in Brazil, where it is super hot, w

arm and humid, then, like, this thing is going to, like, rust really fast and probably be out of work. If it breaks, I have no local support, I have no idea what's going on inside because it's a black box. And as COVID-19 has shown us, our supply chains for equipment, for consumables and all, they are really fragile, right? Like, once there is a disruption from, like, China and so on, everybody gets stalled, and things get delayed, and so on. And so all of this together makes sc

ientific equipment really slow to innovate, right? Because you cannot open it. Like, I'm not gonna fiddle around with a 10,000 pound piece of equipment, right? Like, if I break it, the whole department is gonna want my head. So, like, I better not fiddle around with that, right? And so this is why, like, without access to these tools, proper access, there is no science, right? And this is where open source comes in. This is just a few examples of open source microscopes. They're available today, if

you want to find these projects online, where you could find them and reproduce them right away. And the nice thing about them is that all of them are under $100. Right? Some maybe, actually, let's say all of them are under $200, which is like orders of magnitude cheaper than the ones that I just showed you. And they're all portable. Right, so here on the left, you have the scale view, which is about five centimetres; it is something we developed in the lab a long time ago. And we're using it for education

al purposes and so on. But you could do like 80% of the stuff that you do in the labs, right? This one next to it is actually very specialised for fluorescence, which is like a fancy method, right, in life sciences. But the ones that I would like to highlight are these two. Like, the one on the left, the yellow one, which is called OpenFlexure, is an amazing project that I really like. I'm not involved with the project myself, but I really like the project, because they've actually

made the papers and proved that this actually is able to detect malaria inside blood cells. So the resolution of this is on par with optical microscopes. And you can build it for a fraction of the cost and 3D print the parts wherever you are. Actually, a lot of the developers for the design are in Tanzania and are partner developers with the people here in Bath, which is close by to Brighton. And the other microscope on the right is actually about this big, and goes on the head of mice, and they can actuall

y do live imaging of brain activity while the animals are performing a certain task. And what I think is really crucial about these two examples, these last two, is that the yellow one, OpenFlexure, they actually showed that they can do the same or better performance than actual commercially available microscopes. And the one on the right, which is the Miniscope project, can actually provide something that wasn't available before. So before the minis

cope, as an open project came along, there was no way to have freely moving mice and record their like brain activity with optical imaging. Right. And so this brings a lot of innovation, there is a big community around miniscope, just to like, pound on the point that the speakers that came before me said that fostering a community and bringing people into an open project actually is good for research is good for you. And it's good for innovation and like speeding up the cycles in science and res

earch. So I wanted to keep this kind of short. So this is just a slide that Juli made, and I adapted, to show why we should use open hardware. And I hope I have mentioned this already, even if, like, too fast and giving you a lot of information. But let's, like, take it again, step by step. So, because it's reproducible. So we actually use GitHub and GitLab a lot to put out our projects, our open hardware projects, because it's actually super fun to learn how somebody did something, like how the mini

scope this amazing project works, right, I can go into the documentation, and actually see how they put the boards together, how the electronics work and so on. It's super affordable. So from our experience with people that are being part of our workshops in Africa, they say, look, this is affordable enough that here in our institution, we can, like put some resources together and get started with open source hardware. It's repairable because I know like everything that is going on. Inside the h

ardware, I can also know how to repair it when something goes wrong. But not even that, because I know how everything works. I also know if the data that I get out of it is reliable or not. Right, like, Oh, is this PCR? Or is this image from this microscope really what it's supposed to be? Or is it an artefact? Right, because I know how it works, then I know the limitations and the capabilities. And because I know the limitations and capabilities. I know where and how, and if I need to customise

something. Right. So let's say I'm using the OpenFlexure for something, but right now it works on power from the wall socket. Right, but I need it with batteries. Because I know, like, how it works, I can easily say, okay, if I change this power supply with these batteries, then I can actually bring this to the field and use it outside the lab space, or even in a lab where there are constant power failures. Right. And putting all of this together, this is then obviously, hopefully, if it's do

ne right, it's democratising. Right, because then it doesn't matter if you are in the global north, or if you're doing citizen science, or if you are inside academia, because you're collecting data with tools that you actually know exactly what they're doing. You can go and say, look, this is my data. This is how it was recorded. And it's good data, right? So we need to discuss data and not whether or not, like, I have a $10,000 microscope. Andre: Luckily for us, there is wha

t I'm calling like the Cambrian explosion of open source hardware going on. So what you're seeing here are a lot of projects that are currently available. some highlights that I think are really cool, like here on the top left, you can see this little object on the on the palm of her hand. This is an atomic force microscope, right? So this goes down to image like really, really, really tiny stuff. Right? So it really measures like nanometers. And things that are really small. Then here, on the b

ottom row, you can see the image where there is somebody holding a little whiteboard; this is actually an ECG machine. And this person actually discovered that, like, he writes a blog post, and I can find the reference later if somebody wants. But he actually finds that he has a heart arrhythmia, because he was able to play around with this ECG and brought it to his doctor and said, look, this is my data. This is how my heart looks over a 24-hour period. And there is something wrong

here. Let's fix it, right. And it could have been that he would never, like, know about this. But I think it's interesting that people have the empowerment to, like, know what's going on with them. But on this note, there is also a project that is called Open Insulin, where people are actually making easily monitors and trying to make their own insulin. Because, like, price surges of insulin in the US right now, for instance, are crazy, and a lot of people are really having bad times managing to get the

ir hands on the proper insulin that they need, and so on. And so this also empowers people to say, you know what, like, we are not going to take this nonsense anymore. And we're going to do it our own way. And there are many, many other projects, right, just a little bit of data on something that we're working on at the moment, which is, this is something that we're working with, just to show you that what we're seeing here is the number like the fraction of papers, as a percentage of total publ

ications from PubMed, over time, from the 1990s to 2020, of papers that had either open source hardware, open hardware, or open labware in their abstract, title, or keywords. Right, so you can see that this is now growing. I mean, it's still a tiny, tiny fraction of the total number of papers, of course, but what I like here is how fast new papers are coming, more and more each year. If people are interested in this, I would really recommend taking a look at the GOSH community, which is the gl

obal Open Science hardware community, which is mainly where Juli, Alex and I met. But the point that I like to make here is that this is a really, really global online community most of the time, where people really from all continents are discussing open hardware, they're really, really open to newcomers and really willing to help with questions. And there is I think, the most important document that was done collectively, as you can see on the photo, on the bottom, right, like this was the vot

ing for things in the manifesto that we wrote: like, what is the point of the global Open Science hardware community and what is our manifesto, right. And you can see the points highlighted here. And this tries to make sure that this still keeps going as a really diverse and horizontal community when things are being discussed, and all of these different communities are being taken into consideration. Again, I can share the link for this; if I did notice, here I have a slide in the end

with useful links, this might be there. A little bit of a shameless advert, I'm sorry: we are actually finishing the curriculum for a new programme, and you can pre-register. So if you go to open hardware dot space, you can also find more information about this. And the idea of this programme is to take people that are newcomers into open hardware and to show them best practices, right. So we're going to cover a lot of things that are quite similar in terms of, like, oh, this is GitHub or

this is Instructables. And these are all these platforms, but also, like we're gonna show like points on documentation and licences and things because hardware, believe it or not, most of hardware projects need at least three different licences, which is different from all the projects that we've been discussing right now. But yeah, so I just wanted to say, please take a look at the website, get in touch, if you want, like will be super, super happy to get projects, or even just questions from

you. Because we are in the moment where we're finalising our curriculum and we're ready to launch hopefully, in another couple of months, and then still run a cohort this year. With that, I'd like to say thanks, you can ask me questions. I think you're writing hopefully, in the document. You can find me on Twitter, you can send me an email as well be super happy to chat. Unknown: And just to say that here is a list of what I think are useful links. And the presentation is available also on the d

ocument. I hope I didn't go over time. But this is mostly what I had to share today. Malvika Sharan: Thank you so much, Andre. We have one question, but I'm just gonna announce that we're at the top of the hour; if you have to drop off, please feel free to drop off. But there is one single assignment which is new, which is about breaking down your milestones into smaller milestones. Besides that, everything is a repetition from the last cohort calls, where we asked you to develop your documentation, try t

o launch your website whenever you can. So back to Andre, for folks who are still sticking around. So first of all, there's a beautiful comment, which says, I love how open hardware can really unleash everyone's creativity. And every time this talk happens in our call, people see it for the first time. And it really, you know, it is something that we don't talk about enough. So I'm really glad that you could come and talk about it. We have one question which

says, "I'm blown away by the mini microscope. Can I ask if you came across something similar for other hardware in the lab, such as plate readers or thermal cyclers for PCR?"

Unknown: Absolutely. So one thing that I started doing when I got onto this path a long time ago is this: if you Google "open source" and then the piece of equipment that you need, you're most likely, and this is really, really cool, going to find somebody who has some sort of project to do exactly what you

wanted to do already, right? For instance, there is a very famous open source PCR machine named OpenPCR, and it inspired other projects. It's really much cheaper than regular PCR machines. What I also wanted to say is that COVID has enlisted a lot of people to work on open source real-time PCR machines for detection, testing, and so on, so there are a lot of these projects available online now as well. What was the other equipment? Thermal? Yeah. So all of these, I think, are available; it's a matter of looking for them. And in the links that I sent, there is the Open Hardware Observatory, which has a lot of projects. There are actually, sorry, I'm going on and on, but there are actually $5 PCR machines available, where you do one PCR at a time, the equivalent of one Eppendorf tube. But still, this is $5, right? So if you don't need to do 96-well plates like on a regular machine, this could be enough for

you, right?

Malvika Sharan: So I'm going to post the link that Andre just talked about; it is the project directory where there are different protocols for hardware, I believe. Yeah. Okay. Thanks so much, Andre. That was wonderful. And thank you, everybody, for being here. Andre, you're not yet in the slide, but I'll add you, and I hope people can reach out to you if needed. Yep. Wonderful. Thank you so much, everyone. Emmy and I are going to stick around for five more minutes. So I'm going to stop recording, but if you have any
