Seq2seq Tutorial (TensorFlow)

The Math Student



Oh, hello world! In today's video we're going to be looking at the basics of translation, so let's get started. To do translation we're going to be using this particular thing called sequence-to-sequence, and essentially what it is, is we take in one RNN sequence and then output a different RNN sequence. So, for example, what we have over here is a question-answering diagram. We're not building this particular example, but in this illustration the input is "Are you free tomorrow?", and that goes into the encoder, which is one LSTM in this case, and for the output you have "yes", "what's up", and that's the output RNN. You need to understand that this is going to be a different RNN, a decoder, or whatever you want to call it. Also notice that you have this input token called <START>, and then it's going to say "yes", but the "yes" is itself going to be an input to the next LSTM step, because keep in mind each RNN or LSTM step needs to have an input, it's going to spit out an output, and it's also going to pass on its hidden state to the next step. Every step needs an input, and that's why we feed in the previous step's output as the next input. So it goes "yes", "what's up", and then <END>. Alright, so let's get started with the actual lesson itself. Unfortunately, I had to use TensorFlow instead of Keras, because it was too much effort to write this in Keras with all the customization. So let's move on. The data that we're going to be using in today's lesson is dates. Dates can be written in different formats: the Americans have one way of writing them, the British a different way, and people have their own preferences besides. So let me show you a sample of the data set that we're going to be using today. Okay, so basically, the first column is your
Xs and the second column is your Ys. In this case you have "july 7th", right, and you can also write the day in numbers, or write out the month like September, and some formats even include extra things that really aren't relevant to the translation at all, because all we really care about is getting this "1971-09-14" out. And keep in mind, this is text over here: we have an input text, and then we have an output text, so we're expecting the LSTM to understand what the input is and then output a sequence. Now let me just go through how we process the data, because again we need to turn each character into a number (if you've watched my previous videos, and I really recommend that you do before you get started on these things, you'll know why). So what I've done over here is: I found all the unique characters in X, and then I create a dictionary called char2num, and I do the same thing for Y, because we really just have a limited range of characters. If you're curious as to how the data was generated, it's from this library called faker, which I import all the way up here, and that's how I generate the dates in all these different formats. I generate fifty thousand instances, and this is just a sample of them.
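A minimal sketch of that character-to-number step; the function name `build_vocab`, the special tokens, and the sample strings are my assumptions, not the video's exact code:

```python
# Build character-to-index and index-to-character dictionaries, as described
# above. The <PAD>/<GO> tokens and their positions are assumptions.
def build_vocab(strings, extra_tokens=('<PAD>', '<GO>')):
    """Map every unique character (plus special tokens) to an integer id."""
    chars = sorted(set(''.join(strings)))
    vocab = list(extra_tokens) + chars
    char2num = {c: i for i, c in enumerate(vocab)}
    num2char = {i: c for c, i in char2num.items()}
    return char2num, num2char

# Hypothetical sample inputs; the real data comes from faker.
xs = ['july 7, 1971', '7/7/71']
char2num, num2char = build_vocab(xs)
```

The same function would be applied separately to the Ys, since the output alphabet (digits and dashes) is even smaller.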
Okay, alright, so we have our Xs and Ys. The next thing we do: keep in mind that we have to make the input and the output a particular length. We need a set input length when it comes to X, and the way we do that is by padding. I found that the biggest length among all these date strings was 29; one format even included extra text that was really not relevant to the output, and that's where 29 came from. So given that, what I'm going to be doing is padding each input until its length is 29, so all the inputs have a length of 29. There are ways of getting around fixed lengths, which I won't go into in this particular video, but that's basically how you do it. The output, in this case, is a fixed length anyway: it's always going to be four numbers for the year, then two numbers for the month, and two numbers for the day. But at the same time I'm going to put in a <GO> character, and that's really important; I'll talk about it soon. The batch_data function does what it says: it takes the entire data set and then spits out batches of size batch_size. Okay, so next is the TensorFlow graph. Now let me come back to my diagram before I actually jump into it, so you hopefully understand this better. What's happening over here is we have this 29-length sequence of input characters, and what we're going to be doing is take the last cell state and the last hidden state of the encoder and put them in as the initial state of this blue set of RNNs. So again, the last cell state and the last hidden state become the initial hidden state and initial memory state of the next LSTM, the decoder. Okay, so that's pretty much all that's happening, alright.
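Here is a minimal sketch of that padding and batching, assuming a <PAD> id of 0 and the helper names `pad` and `batch_data` (the video's actual helpers may differ):

```python
import numpy as np

def pad(ids, max_len, pad_id=0):
    """Right-pad a list of character ids out to a fixed length (29 here)."""
    return ids + [pad_id] * (max_len - len(ids))

def batch_data(x, y, batch_size):
    """Shuffle the (x, y) arrays once, then yield batch_size-row slices."""
    idx = np.random.permutation(len(x))
    x, y = x[idx], y[idx]
    for start in range(0, len(x) - batch_size + 1, batch_size):
        yield x[start:start + batch_size], y[start:start + batch_size]
```

With fifty thousand padded rows, this just walks through the shuffled arrays one batch at a time.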
The star thing is what I called <GO>, and I need that because I need a first value to feed in, and I can set that first value to be <GO>. Okay, so let's look at our TensorFlow graph. Alright, so this is where all the modelling happens, and over here it's really important that you use tf.InteractiveSession and not tf.Session: whenever you're in a notebook, make sure you use InteractiveSession so you can play around with things, otherwise weird things happen, so let's avoid all of that. So this block over here defines inputs, outputs and targets. I'll talk about the difference between the outputs and the targets, but the inputs are just the input X I described, so there's no mystery there. The targets are going to be all the characters of the output except for the <GO> character, and the outputs start from the <GO> character; again, it's to do with Y, but that might be a bit confusing, so let's come back to it. Then this is the embedding. We initialize it to be a random uniform between -1 and 1, and we have to make it a tf.Variable, because anything whose state is going to be changing during training is a variable. And notice that the Xs and the Ys are tf.placeholders, because we're going to be feeding the data in there, and that needs a placeholder. Also important to notice is how I put in None and then the sequence length. The None basically means that dimension can be a variable length; usually the first dimension is the batch size, and then whatever the rest has to be. So the outputs in this case are going to be batch size by the Y sequence length, but I set that to be variable, and again I'll talk about why I made the output length variable and fixed the input length.
Okay, so I talked about embeddings, and then you need to do an embedding lookup, because keep in mind, all an embedding is, is a big lookup table; there's really nothing special about an embedding. So from that 29-length input you look up whatever character comes in, and then what we're going to be doing is feeding this input embedding into an LSTM. So this is the encoder over here; well, I should call it an RNN really, where the basic unit is the LSTM cell, and the sequence is handled by a dynamic RNN. All that dynamic RNN means is that we're going to be unrolling it, like appending those LSTM cells up to a length of 29. And we don't even need to specify that it's 29, because this date input embedding will take care of that for us, which is really what's cool about the dynamic RNN versus the static RNN: as the name suggests, the static one stays fixed to a length of 29, which we don't want when it comes to the output. Alright, I'll come back to that. The decoding layer is a similar kind of thing, and just notice how the last state from the encoder's dynamic RNN becomes the initial state of the decoder. So over here I have my encoder and my decoder; I take the last state of the encoder and chuck it in there. For the encoder we don't care about the outputs, only the last state, but for the decoder we do care about all the outputs. Okay, so come back to the diagram over here: for the input, all I really wanted was these two things, the states, but when it comes to the decoder, I really want these arrows pointing up. Okay, so that's what dec_outputs is down here, so let's come back.
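Since an embedding really is just a lookup table, a couple of NumPy lines make the idea concrete; the vocab size of 12 and the input ids are made up, while the embedding size of 10 and the uniform(-1, 1) initialization match what the video describes:

```python
import numpy as np

# An embedding lookup is plain table indexing: row i of the table is the
# vector for character id i. This mirrors tf.nn.embedding_lookup.
vocab_size, embed_size = 12, 10        # vocab_size is an assumption
rng = np.random.default_rng(0)
embedding = rng.uniform(-1.0, 1.0, size=(vocab_size, embed_size))

input_ids = np.array([[3, 4, 0, 0]])   # one padded input sequence (made up)
input_embed = embedding[input_ids]     # shape: (batch, time, embed_size)
```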
Yeah, okay, so dec_outputs is those arrows pointing up, and then we're going to take that and feed it into a fully connected layer; if you've used Keras, that's the same as a Dense layer. So those become what's called logits here, and that's what's going to be fed into my loss function. Luckily, TensorFlow has this sweet sequence loss function that's already been written, but essentially all it is is a softmax cross-entropy that's applied to each and every element along the sequence. Okay, that's all it is, and then we have our targets. Don't worry too much about the tf.ones weights argument; I'm surprised that it's even in there, it should really be hidden from the user. And then we have our optimizer. Okay, so hopefully you're starting to see some similarities with Keras, if you've used it before. Now, if you're really trying to understand what the graph looks like, it's a really useful thing to look at the shape of each of these tensors. For any of the variables that I've mentioned over here, I can call .get_shape() and then .as_list(). Now, for the last state, keep in mind it's spitting out the hidden state and the cell memory, so there are two things coming out and you need to specify which one you want, 0 or 1, because it's a tuple that comes out, and then you call .shape and .as_list() on that. And here I've checked out some of the other ones, so date_input_embed is the sequence of 29 by the embedding size, which, if I remember correctly, I set to 10 over here. Okay. Oh, and this show_graph function is something
I took from Stack Overflow (I put the reference link up there), but it's really cool because you can check out what the graph looks like right in your notebook. Let me zoom out over here. Alright, so you have your encoder (I wish it had the sides switched around, but it is what it is), you have your embedding and the embedding lookup; what it looks up depends on what your inputs are, but you can still follow the arrows into the encoding here. The good thing is, if you really want to see what the graph looks like, you can expand it, and you can see that there's an RNN going in there; it doesn't quite say LSTM over here, but anyway. In the same way you can look at the decoder, and then you have your fully connected layer. So that's what the diagram looks like, and you can play around with that yourself. So now let's start looking at the training. Alright, so in training I'm splitting the data into a test set and the rest of it into a train set, and keep in mind, x_train is your input sequence and y_train is your output sequence. OK, so when we're training, keep in mind we had three things to feed in: the inputs, the outputs and the targets. The inputs should be fairly obvious; it's basically this, which I call source_batch. Then there's the target batch: the actual targets are going to be everything but the <GO> symbol, and what I've called outputs is really the inputs into the decoding LSTM, so that's why I've said everything except for the last character. OK, so again, the targets are everything but the <GO> character, and then the outputs are everything but the last character. OK, so let's go back to the diagram.
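That offset can be sketched in a couple of lines; GO=1 and the digit ids are assumed values, not the video's actual data:

```python
# Split one <GO>-prefixed label batch into decoder inputs and loss targets,
# as described above. GO=1 is an assumed token id.
GO = 1
target_batch = [[GO, 9, 7, 1, 4]]                     # <GO> + date characters

dec_input_batch = [row[:-1] for row in target_batch]  # fed into the decoder
targets         = [row[1:]  for row in target_batch]  # scored by the loss
```

So each decoder input is the character the model saw one step earlier, and each target is the character it should produce next.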
So if you look over here, see how these feed into the decoder: in this case it will be <START>, and then the next ones will be "yes", "what's up". So everything, including <GO>, will be the inputs into the decoding layer, and then the targets will be "yes", "what's up" and <END> (in this case I didn't use <END> as a final padding character, but the idea is the same). So it's basically everything offset by one: the characters offset by one become the final targets, and the decoder inputs are <GO> and then one fewer character. Okay, and this is only happening during training; we're not going to be using that trick during testing. So once I've trained this, you can see that the accuracy starts off at 55%, which is actually quite good, because keep in mind there are twenty-something possible characters to predict from, so 55% is actually really good when you think about it: it's not like flipping a coin, since this has twenty-something classes. It goes up to ninety percent, and if I look at the test set, the accuracy on the test set is 88%. Okay, so that's all nice, but we're not done yet. What we need to do is, given an input date, predict the output in a real scenario. To do that: luckily, in this case I know the length of the Y sequence, it's 10 or 11. So what we're going to do is, I've taken one particular date, so the source_batch over here is fixed, but the decoder input is going to be changing: the decoder input at the beginning is simply just char2num of Y's <GO>.
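That greedy inference loop can be sketched with a stub in place of the trained model; `fake_logits`, the token ids, and `y_seq_len` here are all made up for illustration, and in the real notebook the logits would come from a session run instead:

```python
import numpy as np

GO, y_seq_len = 1, 4  # assumed <GO> id and output length

def fake_logits(source, dec_input, vocab_size=12):
    """Stand-in for the trained network: just echoes the source sequence."""
    step = dec_input.shape[1] - 1
    out = np.zeros((dec_input.shape[0], dec_input.shape[1], vocab_size))
    out[:, -1, source[0][step % source.shape[1]]] = 1.0
    return out

source = np.array([[3, 5, 2, 7]])      # the fixed input date (made-up ids)
dec_input = np.array([[GO]])           # decoder input starts as just <GO>
for _ in range(y_seq_len):
    logits = fake_logits(source, dec_input)
    prediction = logits[:, -1].argmax(axis=-1)      # most probable next char
    dec_input = np.hstack([dec_input, prediction[:, None]])  # append & loop
```

After the loop, dec_input holds <GO> followed by the generated characters; dropping the <GO> gives the predicted date.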
Right, so it's just going to be <GO> in this particular case, and then what we do is we take the prediction to be the maximum value of the logits: we want the maximum-probability value to be the prediction. And then I take the prediction and tack it onto my decoder input. So you really need to understand what's happening in this for loop. Again, I have a fixed input, which is basically the date that I want converted, and the decoder input starts off as just the <GO> character, and then we start predicting, and whatever we predict we append onto our decoder input, and then we loop through it again, up to a length of y_seq_len. So once we do that, we get the dates that we want, and doing that is how I got the 88% accuracy; this is how it's done the conversion: it's taken this input and, character by character, generated all these things. Alright, so just notice that it doesn't care whether October is written with a capital O or where it sits in the string; the final thought vector understands and summarizes what that date is, and then starts outputting the output sequence. So hopefully that made sense. If you have any questions or comments, please do ask down below. Make sure you subscribe, check out deepschool.io, star the repo, and thanks for watching.
