1 00:00:00,539 --> 00:00:03,539 foreign 2 00:00:10,080 --> 00:00:14,280 they will be able to see us now in that 3 00:00:12,599 --> 00:00:16,260 question time through the camera so wave 4 00:00:14,280 --> 00:00:18,359 hi 5 00:00:16,260 --> 00:00:22,100 hello 6 00:00:18,359 --> 00:00:22,100 are we all ready to go over there 7 00:00:23,580 --> 00:00:30,119 yeah I I'm all set excellent 8 00:00:27,779 --> 00:00:34,640 how about it 9 00:00:30,119 --> 00:00:37,800 thank you uh hello everyone my name is 10 00:00:34,640 --> 00:00:39,719 and I'll be talking to today about n 11 00:00:37,800 --> 00:00:43,260 plus one problem in Django rest 12 00:00:39,719 --> 00:00:46,020 framework here at pycon AU very excited 13 00:00:43,260 --> 00:00:49,500 to be here even though remotely but 14 00:00:46,020 --> 00:00:51,120 since my first time at baikon EU so 15 00:00:49,500 --> 00:00:52,800 pretty much excited 16 00:00:51,120 --> 00:00:56,460 uh 17 00:00:52,800 --> 00:00:58,379 so our agenda today is like just a brief 18 00:00:56,460 --> 00:01:00,180 of what is Dengue rest framework and the 19 00:00:58,379 --> 00:01:03,420 rest of the talk will focus on like what 20 00:01:00,180 --> 00:01:06,000 is n plus one problem how it can happen 21 00:01:03,420 --> 00:01:08,760 within Django rest framework how do we 22 00:01:06,000 --> 00:01:13,159 identify these problems uh in our apis 23 00:01:08,760 --> 00:01:17,220 and how do we resolve these uh problems 24 00:01:13,159 --> 00:01:17,939 uh before I jump into the content who am 25 00:01:17,220 --> 00:01:19,939 I 26 00:01:17,939 --> 00:01:23,040 again myself 27 00:01:19,939 --> 00:01:25,920 Ali I am a software engineer by 28 00:01:23,040 --> 00:01:29,040 profession and currently I am working as 29 00:01:25,920 --> 00:01:30,299 a contractor with rbsoft at Lahore 30 00:01:29,040 --> 00:01:33,299 Pakistan 31 00:01:30,299 --> 00:01:36,900 so ibisoft is mainly mainly a service 32 00:01:33,299 --> 00:01:41,159 provider company and we have a diverse 33 00:01:36,900 --> 00:01:43,320 set of technological skills uh primarily 34 00:01:41,159 --> 00:01:45,740 I've been working with python and Django 35 00:01:43,320 --> 00:01:48,720 for last couple of years 36 00:01:45,740 --> 00:01:51,479 as part of rbsoft Team I've been working 37 00:01:48,720 --> 00:01:52,799 with edx Slash to you as rbsoft 38 00:01:51,479 --> 00:01:54,899 contractor 39 00:01:52,799 --> 00:01:56,939 apart from that I have been involved 40 00:01:54,899 --> 00:01:58,799 with a few internal products development 41 00:01:56,939 --> 00:02:00,780 within RB soft 42 00:01:58,799 --> 00:02:03,299 if you want to get in touch with me over 43 00:02:00,780 --> 00:02:06,060 LinkedIn read something that I have 44 00:02:03,299 --> 00:02:08,819 written on medium or see my GitHub the 45 00:02:06,060 --> 00:02:10,259 links are here I'll share the slides 46 00:02:08,819 --> 00:02:11,340 slice Link at the end of the 47 00:02:10,259 --> 00:02:14,400 presentation 48 00:02:11,340 --> 00:02:17,340 uh with that out of the way let's jump 49 00:02:14,400 --> 00:02:19,020 into the content starting it it off with 50 00:02:17,340 --> 00:02:20,640 jungler's framework 51 00:02:19,020 --> 00:02:24,900 so 52 00:02:20,640 --> 00:02:27,720 if any of you work with Django over like 53 00:02:24,900 --> 00:02:29,819 over the last decade it's highly likely 54 00:02:27,720 --> 00:02:32,819 that you got to interact with djangora's 55 00:02:29,819 --> 00:02:35,580 framework in one way or another uh like 56 00:02:32,819 --> 00:02:39,599 with the Advent of cloud and single page 57 00:02:35,580 --> 00:02:43,019 applications uh we have seen uh arriving 58 00:02:39,599 --> 00:02:45,540 popularity of Django and apis develop on 59 00:02:43,019 --> 00:02:46,379 Django back-ends using dangote's 60 00:02:45,540 --> 00:02:48,720 framework 61 00:02:46,379 --> 00:02:51,239 uh what's beautiful about jaguars 62 00:02:48,720 --> 00:02:53,280 framework is that uh 63 00:02:51,239 --> 00:02:55,140 it is built upon the core Django 64 00:02:53,280 --> 00:02:56,220 features and it integrates really well 65 00:02:55,140 --> 00:02:59,220 with them 66 00:02:56,220 --> 00:03:01,860 so for instance how we expose our apis 67 00:02:59,220 --> 00:03:05,840 using URLs and Views it's all built upon 68 00:03:01,860 --> 00:03:08,519 the Django's views and URLs uh framework 69 00:03:05,840 --> 00:03:11,459 using Django models we can easily create 70 00:03:08,519 --> 00:03:13,620 serializers so how see laser or like how 71 00:03:11,459 --> 00:03:14,780 we represent data in Json and other 72 00:03:13,620 --> 00:03:17,599 format 73 00:03:14,780 --> 00:03:19,739 drf also take care of tcls 74 00:03:17,599 --> 00:03:22,260 deserialization so in a sense if we are 75 00:03:19,739 --> 00:03:24,659 getting data in form of Json or some 76 00:03:22,260 --> 00:03:28,319 other formats uh it does the job of 77 00:03:24,659 --> 00:03:30,900 converting that Json into appropriate 78 00:03:28,319 --> 00:03:34,260 model objects 79 00:03:30,900 --> 00:03:36,780 mainly drf is used for API development 80 00:03:34,260 --> 00:03:39,659 and for that it provides a variety of 81 00:03:36,780 --> 00:03:43,200 options in how we design our views 82 00:03:39,659 --> 00:03:45,420 if we like uh want to write a complete 83 00:03:43,200 --> 00:03:46,500 API from scratch we can utilize API 84 00:03:45,420 --> 00:03:48,900 views 85 00:03:46,500 --> 00:03:51,599 but if we have serializers we can use 86 00:03:48,900 --> 00:03:54,480 view sets and genetics to set up an API 87 00:03:51,599 --> 00:03:57,239 within like three to four lines 88 00:03:54,480 --> 00:03:59,700 and lastly using custom authentication 89 00:03:57,239 --> 00:04:03,060 and permission we can easily add layers 90 00:03:59,700 --> 00:04:05,640 of controls on our apis so in that 91 00:04:03,060 --> 00:04:07,799 regard drf make thing makes things 92 00:04:05,640 --> 00:04:10,760 really easy in how we can design apis 93 00:04:07,799 --> 00:04:10,760 using Django 94 00:04:11,760 --> 00:04:16,139 look moving forward like what is n plus 95 00:04:14,220 --> 00:04:18,540 one problem and how does relational 96 00:04:16,139 --> 00:04:23,100 database come into the play here 97 00:04:18,540 --> 00:04:27,180 so in typical rdbms like my SQL or 98 00:04:23,100 --> 00:04:29,580 postgres uh how we represent real life 99 00:04:27,180 --> 00:04:31,680 objects is in the form of tables so our 100 00:04:29,580 --> 00:04:35,040 data model is scattered across different 101 00:04:31,680 --> 00:04:37,139 tables and the tables are connected are 102 00:04:35,040 --> 00:04:39,660 related to each other via some sort of 103 00:04:37,139 --> 00:04:41,759 relationship beat one one one to one one 104 00:04:39,660 --> 00:04:44,880 to many or many to many 105 00:04:41,759 --> 00:04:47,460 and we join those tables to get the 106 00:04:44,880 --> 00:04:51,720 complete information of an entity 107 00:04:47,460 --> 00:04:54,060 so let's say for this example we have a 108 00:04:51,720 --> 00:04:55,860 customer that can order a bunch of 109 00:04:54,060 --> 00:04:58,139 things and within an order there can be 110 00:04:55,860 --> 00:05:01,259 multiple products so if we want to get 111 00:04:58,139 --> 00:05:03,360 the uh entire order information of a 112 00:05:01,259 --> 00:05:04,500 customer we can do that in two different 113 00:05:03,360 --> 00:05:06,540 ways 114 00:05:04,500 --> 00:05:09,300 first way is that we get customer 115 00:05:06,540 --> 00:05:12,000 information from the customer table 116 00:05:09,300 --> 00:05:13,740 based on the customer ID we can get all 117 00:05:12,000 --> 00:05:16,380 the orders made by that customers and 118 00:05:13,740 --> 00:05:18,419 then based upon order IDs we can get all 119 00:05:16,380 --> 00:05:21,419 the products 120 00:05:18,419 --> 00:05:23,759 the second way of doing that is simply 121 00:05:21,419 --> 00:05:26,220 join the tables using for foreign keys 122 00:05:23,759 --> 00:05:28,440 and then get all the required data in 123 00:05:26,220 --> 00:05:32,100 one or two queries 124 00:05:28,440 --> 00:05:34,320 so these are two different ways of 125 00:05:32,100 --> 00:05:37,320 achieving the same thing 126 00:05:34,320 --> 00:05:40,139 in first method we are making one query 127 00:05:37,320 --> 00:05:42,300 to get the main object and then for each 128 00:05:40,139 --> 00:05:44,580 related object we are making a separate 129 00:05:42,300 --> 00:05:46,979 query so one for one query for main 130 00:05:44,580 --> 00:05:49,080 objects and then for each of the related 131 00:05:46,979 --> 00:05:51,300 object we are making uh separate queries 132 00:05:49,080 --> 00:05:52,680 so it's one plus NQ result n plus one 133 00:05:51,300 --> 00:05:56,460 queries 134 00:05:52,680 --> 00:05:58,139 the second is we are just joining uh 135 00:05:56,460 --> 00:05:59,340 the tables and getting all the 136 00:05:58,139 --> 00:06:01,680 information 137 00:05:59,340 --> 00:06:03,539 so the question here might be okay both 138 00:06:01,680 --> 00:06:05,580 methods are like doing the same things 139 00:06:03,539 --> 00:06:10,080 in a different way so why is n plus 1 140 00:06:05,580 --> 00:06:13,199 and issue when it comes to rdbms 141 00:06:10,080 --> 00:06:16,919 it's not really an issue on the DB level 142 00:06:13,199 --> 00:06:19,259 but when we involve our orms or object 143 00:06:16,919 --> 00:06:20,880 relational mapping that's where n plus 1 144 00:06:19,259 --> 00:06:25,199 becomes a problem 145 00:06:20,880 --> 00:06:28,979 uh so Django is an orm SQL Alchemy is an 146 00:06:25,199 --> 00:06:31,680 orm and with orm we can uh 147 00:06:28,979 --> 00:06:35,759 do a bit of uh abstraction and how we 148 00:06:31,680 --> 00:06:38,280 design our database so we Define our 149 00:06:35,759 --> 00:06:41,819 models using orms and then orms take 150 00:06:38,280 --> 00:06:43,880 care of like generating the appropriate 151 00:06:41,819 --> 00:06:46,620 table adding the appropriate 152 00:06:43,880 --> 00:06:49,500 relationships when we want to fetch 153 00:06:46,620 --> 00:06:52,199 something Oram takes care of all of that 154 00:06:49,500 --> 00:06:55,259 so all of the heavy load of connecting 155 00:06:52,199 --> 00:06:58,620 with uh uh like database server getting 156 00:06:55,259 --> 00:07:01,319 the information that is handled by orm 157 00:06:58,620 --> 00:07:04,919 why is n plus one problem in here is 158 00:07:01,319 --> 00:07:06,300 that uh Oram has to make a DB round trip 159 00:07:04,919 --> 00:07:08,699 for each query 160 00:07:06,300 --> 00:07:12,539 so let's say if uh for the previous 161 00:07:08,699 --> 00:07:15,600 example if a customer has 100 orders so 162 00:07:12,539 --> 00:07:18,600 orm might be making 100 plus queries to 163 00:07:15,600 --> 00:07:20,039 get to get the information for One 164 00:07:18,600 --> 00:07:23,099 customer 165 00:07:20,039 --> 00:07:24,960 the individual queries might be fast but 166 00:07:23,099 --> 00:07:27,180 it's the combination of all the security 167 00:07:24,960 --> 00:07:29,460 that becomes problematic 168 00:07:27,180 --> 00:07:31,979 so essentially the more queries are 169 00:07:29,460 --> 00:07:34,919 there in the fetch operation 170 00:07:31,979 --> 00:07:37,259 the more round trips to the database 171 00:07:34,919 --> 00:07:39,720 will be made by orm and the more round 172 00:07:37,259 --> 00:07:41,699 trips are there the more load we'll be 173 00:07:39,720 --> 00:07:44,039 putting on our database servers and 174 00:07:41,699 --> 00:07:47,880 causing performance issues 175 00:07:44,039 --> 00:07:52,080 so because of how olm works that's why 176 00:07:47,880 --> 00:07:54,120 the N plus one becomes a performance uh 177 00:07:52,080 --> 00:07:56,840 like performance complication 178 00:07:54,120 --> 00:07:56,840 performance problem 179 00:07:57,300 --> 00:08:02,460 okay uh 180 00:07:59,699 --> 00:08:07,259 moving forward I'll be using this 181 00:08:02,460 --> 00:08:10,259 example for like all the explain all of 182 00:08:07,259 --> 00:08:12,539 the things in drf 183 00:08:10,259 --> 00:08:15,479 on online resources the author and book 184 00:08:12,539 --> 00:08:17,940 example is used a lot to explain the N 185 00:08:15,479 --> 00:08:21,300 plus one problem and that is why I'm 186 00:08:17,940 --> 00:08:24,360 using that example here as well so in 187 00:08:21,300 --> 00:08:27,240 this we have an author uh one to one 188 00:08:24,360 --> 00:08:30,120 with Django user uh 189 00:08:27,240 --> 00:08:33,719 author can write multiple books book can 190 00:08:30,120 --> 00:08:36,539 be of a category and we can add tags to 191 00:08:33,719 --> 00:08:37,979 books like New York best time seller 192 00:08:36,539 --> 00:08:39,959 number one seller 193 00:08:37,979 --> 00:08:44,520 Etc 194 00:08:39,959 --> 00:08:47,279 uh so it's like really simple uh 195 00:08:44,520 --> 00:08:49,380 model nothing complicated 196 00:08:47,279 --> 00:08:50,940 the idea here was to show different type 197 00:08:49,380 --> 00:08:53,100 of relationship that we can have and see 198 00:08:50,940 --> 00:08:56,100 how n plus one impacts them so on author 199 00:08:53,100 --> 00:08:59,100 we have a one-to-one with user manager 200 00:08:56,100 --> 00:09:02,220 money with specializations on book 201 00:08:59,100 --> 00:09:04,380 we have a foreign key to author foreign 202 00:09:02,220 --> 00:09:06,000 key with category and manage money with 203 00:09:04,380 --> 00:09:07,860 uh tags 204 00:09:06,000 --> 00:09:11,279 and the API is two apis that will look 205 00:09:07,860 --> 00:09:12,720 is listing all authors with their books 206 00:09:11,279 --> 00:09:14,940 and 207 00:09:12,720 --> 00:09:16,620 list all the books including the person 208 00:09:14,940 --> 00:09:19,500 who authored it so it's like we are 209 00:09:16,620 --> 00:09:23,120 doing similar things but there will be 210 00:09:19,500 --> 00:09:24,660 operational differences 211 00:09:23,120 --> 00:09:28,080 uh 212 00:09:24,660 --> 00:09:31,019 it's just a screenshot of this apis in 213 00:09:28,080 --> 00:09:33,300 action so on the left side is the 214 00:09:31,019 --> 00:09:34,200 authors listing where we are listing the 215 00:09:33,300 --> 00:09:36,120 authors 216 00:09:34,200 --> 00:09:37,019 details and the books that they have 217 00:09:36,120 --> 00:09:38,880 written 218 00:09:37,019 --> 00:09:40,920 on the right side is the book listing 219 00:09:38,880 --> 00:09:43,560 where we are adding the details of books 220 00:09:40,920 --> 00:09:46,440 including the author information 221 00:09:43,560 --> 00:09:48,740 but here we won't be having the books 222 00:09:46,440 --> 00:09:50,540 info information again so it's already 223 00:09:48,740 --> 00:09:54,540 up there 224 00:09:50,540 --> 00:09:57,839 why I'm using listing because in listing 225 00:09:54,540 --> 00:10:00,360 we are fetching objects in bulb and it's 226 00:09:57,839 --> 00:10:02,940 really easy to demonstrate what's going 227 00:10:00,360 --> 00:10:04,980 on behind the scene with listing with 228 00:10:02,940 --> 00:10:06,779 retrieval identifying n plus one one 229 00:10:04,980 --> 00:10:08,399 problem might be tricky but with listing 230 00:10:06,779 --> 00:10:11,000 it's really easy to see what's going on 231 00:10:08,399 --> 00:10:11,000 under the hood 232 00:10:12,600 --> 00:10:19,140 so let's start off with like how n plus 233 00:10:15,360 --> 00:10:21,660 one can happen with drf views 234 00:10:19,140 --> 00:10:24,060 uh 235 00:10:21,660 --> 00:10:26,760 in here we'll be more focused on the 236 00:10:24,060 --> 00:10:28,019 generic and models view sets that drf 237 00:10:26,760 --> 00:10:32,459 provides 238 00:10:28,019 --> 00:10:34,620 in both of these uh options we require a 239 00:10:32,459 --> 00:10:35,880 query set to return our necessary 240 00:10:34,620 --> 00:10:38,040 objects 241 00:10:35,880 --> 00:10:40,140 security sites in Django terms is just a 242 00:10:38,040 --> 00:10:42,360 collection of DB objects it can be zero 243 00:10:40,140 --> 00:10:46,320 one or many 244 00:10:42,360 --> 00:10:48,360 we can either set a value for Q a set 245 00:10:46,320 --> 00:10:50,399 attribute or override get query set 246 00:10:48,360 --> 00:10:52,680 method 247 00:10:50,399 --> 00:10:55,680 so it's that query set that defines how 248 00:10:52,680 --> 00:10:57,180 or which model will be used to get the 249 00:10:55,680 --> 00:10:59,519 objects from 250 00:10:57,180 --> 00:11:02,240 so let's say for both book and auto 251 00:10:59,519 --> 00:11:05,160 listing we are just adding 252 00:11:02,240 --> 00:11:08,880 book.objects.org or author.objects.all 253 00:11:05,160 --> 00:11:10,019 and it takes care of listing all of the 254 00:11:08,880 --> 00:11:11,700 entities 255 00:11:10,019 --> 00:11:13,920 so you can see like it's very easy to 256 00:11:11,700 --> 00:11:17,040 open within three or four lines we have 257 00:11:13,920 --> 00:11:20,700 set up an API view but there are things 258 00:11:17,040 --> 00:11:23,459 happening uh at the back 259 00:11:20,700 --> 00:11:27,360 now the problem with query sets is that 260 00:11:23,459 --> 00:11:30,240 they are lazy like lazy loaded uh DB 261 00:11:27,360 --> 00:11:33,180 won't be hit until the object is really 262 00:11:30,240 --> 00:11:34,440 needed so it is not pre-fetched the 263 00:11:33,180 --> 00:11:36,420 information 264 00:11:34,440 --> 00:11:39,720 once we require that information that's 265 00:11:36,420 --> 00:11:41,880 when DB will be hit together information 266 00:11:39,720 --> 00:11:44,700 so it goes through for the main objects 267 00:11:41,880 --> 00:11:46,560 the related object 268 00:11:44,700 --> 00:11:49,800 on the main object 269 00:11:46,560 --> 00:11:53,459 and the related object objects on the 270 00:11:49,800 --> 00:11:57,120 related objects so for every case uh 271 00:11:53,459 --> 00:11:59,100 unless the object is not uh needed it 272 00:11:57,120 --> 00:12:01,800 won't be retrieved 273 00:11:59,100 --> 00:12:04,440 so this query set behavior is something 274 00:12:01,800 --> 00:12:06,200 that saves into like drf views and cause 275 00:12:04,440 --> 00:12:09,959 problem complications 276 00:12:06,200 --> 00:12:13,980 so let's say uh if we take this example 277 00:12:09,959 --> 00:12:16,920 book Dot objects.org uh 278 00:12:13,980 --> 00:12:18,839 if we want to print out the book 279 00:12:16,920 --> 00:12:20,880 author's username 280 00:12:18,839 --> 00:12:24,120 for each book we'll be hitting the 281 00:12:20,880 --> 00:12:26,640 database because the query set would not 282 00:12:24,120 --> 00:12:28,560 have prefetched the author information 283 00:12:26,640 --> 00:12:29,579 so let's say if we have 100 books we'll 284 00:12:28,560 --> 00:12:32,459 be 285 00:12:29,579 --> 00:12:35,040 hitting at least 100 times 286 00:12:32,459 --> 00:12:36,240 to get the other information for each 287 00:12:35,040 --> 00:12:38,600 book 288 00:12:36,240 --> 00:12:41,459 so like that's a 289 00:12:38,600 --> 00:12:45,200 problem with curioset but that impacts 290 00:12:41,459 --> 00:12:48,300 how drfus are are written 291 00:12:45,200 --> 00:12:50,220 similarly uh on the serializer level uh 292 00:12:48,300 --> 00:12:51,959 serializer and like how we diff we 293 00:12:50,220 --> 00:12:54,360 Define how the objects are or its 294 00:12:51,959 --> 00:12:58,079 related objects will be represented 295 00:12:54,360 --> 00:13:01,079 on serializer uh we can also use Query 296 00:12:58,079 --> 00:13:04,019 search for certain type of fields 297 00:13:01,079 --> 00:13:07,680 uh so let's say uh if I'm getting the 298 00:13:04,019 --> 00:13:09,540 tags information I can uh Define the 299 00:13:07,680 --> 00:13:12,120 query set from where this information 300 00:13:09,540 --> 00:13:14,040 will be fetched and this is like 301 00:13:12,120 --> 00:13:15,839 specific to certain Fields where we can 302 00:13:14,040 --> 00:13:19,139 use Query set 303 00:13:15,839 --> 00:13:22,200 or Source attribute 304 00:13:19,139 --> 00:13:23,940 so Source attribute is where we Define 305 00:13:22,200 --> 00:13:24,899 which attribute to read the information 306 00:13:23,940 --> 00:13:26,639 from 307 00:13:24,899 --> 00:13:30,000 so let's say I'm I'm getting the author 308 00:13:26,639 --> 00:13:30,920 information and we need the email uh I 309 00:13:30,000 --> 00:13:33,240 can 310 00:13:30,920 --> 00:13:35,339 tell that okay read email from 311 00:13:33,240 --> 00:13:37,740 user.email 312 00:13:35,339 --> 00:13:40,620 and ship same is the case for username 313 00:13:37,740 --> 00:13:42,779 now for both of these uh 314 00:13:40,620 --> 00:13:46,620 operations 315 00:13:42,779 --> 00:13:50,220 if these have not been pre-fetched these 316 00:13:46,620 --> 00:13:52,079 calls will result in additional DPA 317 00:13:50,220 --> 00:13:54,839 of course 318 00:13:52,079 --> 00:13:57,360 so if we are listing there will be like 319 00:13:54,839 --> 00:14:00,420 n plus one on the query set but and also 320 00:13:57,360 --> 00:14:02,160 n plus one on DC laser level as well so 321 00:14:00,420 --> 00:14:03,000 unless we have prefetched the user 322 00:14:02,160 --> 00:14:06,060 information 323 00:14:03,000 --> 00:14:08,040 using such uh configuration can result 324 00:14:06,060 --> 00:14:09,660 in additional DB calls when we are 325 00:14:08,040 --> 00:14:11,940 fetching information 326 00:14:09,660 --> 00:14:14,100 so like the fields that take source and 327 00:14:11,940 --> 00:14:17,880 query set attributes are prone to n plus 328 00:14:14,100 --> 00:14:19,860 one uh issues so if like again the 329 00:14:17,880 --> 00:14:22,200 object the iterated object is not 330 00:14:19,860 --> 00:14:25,440 already prefetched it will 331 00:14:22,200 --> 00:14:27,540 cause additional database calls 332 00:14:25,440 --> 00:14:29,700 and again the behavior is that because 333 00:14:27,540 --> 00:14:32,880 curious ads are lazy they do not fetch 334 00:14:29,700 --> 00:14:35,540 the related object unless we need those 335 00:14:32,880 --> 00:14:35,540 objects 336 00:14:36,000 --> 00:14:42,000 so when we are writing uh apis be it in 337 00:14:39,839 --> 00:14:44,880 Django or prf 338 00:14:42,000 --> 00:14:47,220 how can we know that 339 00:14:44,880 --> 00:14:51,240 the code that we are writing has n Plus 340 00:14:47,220 --> 00:14:55,680 One problems so like what uh again uh 341 00:14:51,240 --> 00:14:58,740 I'm using listing to uh to Showcase this 342 00:14:55,680 --> 00:15:00,600 on local we might not have that much of 343 00:14:58,740 --> 00:15:04,139 data to see what might be going wrong 344 00:15:00,600 --> 00:15:06,480 here so for to identify and plus one 345 00:15:04,139 --> 00:15:10,199 problems we have a few tools that we can 346 00:15:06,480 --> 00:15:12,660 use uh one is n plus one the second is 347 00:15:10,199 --> 00:15:14,880 Django debug toolbars so both of these 348 00:15:12,660 --> 00:15:18,240 are the packages that can be easily uh 349 00:15:14,880 --> 00:15:21,060 configured with our Django applications 350 00:15:18,240 --> 00:15:22,980 n plus one is a package specific to 351 00:15:21,060 --> 00:15:25,740 detect n plus one problem so it's it 352 00:15:22,980 --> 00:15:27,779 like analyzes the application logs and 353 00:15:25,740 --> 00:15:29,339 logs the potential and plus one issues 354 00:15:27,779 --> 00:15:30,600 that might be happening within the 355 00:15:29,339 --> 00:15:32,579 application 356 00:15:30,600 --> 00:15:34,800 it does not tell where that issue is 357 00:15:32,579 --> 00:15:36,240 coming from it just logs out that on 358 00:15:34,800 --> 00:15:37,019 certain Fields there might be n plus one 359 00:15:36,240 --> 00:15:38,760 problem 360 00:15:37,019 --> 00:15:43,079 uh this package is not actively 361 00:15:38,760 --> 00:15:47,220 maintained but uh but it still works so 362 00:15:43,079 --> 00:15:50,339 that's plus one about it so 363 00:15:47,220 --> 00:15:53,339 using the same package on both our books 364 00:15:50,339 --> 00:15:55,740 and author listing uh endpoint 365 00:15:53,339 --> 00:15:58,560 it is just tell that there is n plus one 366 00:15:55,740 --> 00:16:00,540 query on book dot tax set or type field 367 00:15:58,560 --> 00:16:03,000 other fields same for the author dot 368 00:16:00,540 --> 00:16:06,180 user Fields author.books fail 369 00:16:03,000 --> 00:16:09,000 it does not tell where this n plus one 370 00:16:06,180 --> 00:16:13,019 is originating from 371 00:16:09,000 --> 00:16:15,420 so that's why it it is better to test 372 00:16:13,019 --> 00:16:17,820 each API in isolation 373 00:16:15,420 --> 00:16:20,100 so take one API 374 00:16:17,820 --> 00:16:22,860 see how n plus 1 is responding against 375 00:16:20,100 --> 00:16:26,660 that and try to figure out where the N 376 00:16:22,860 --> 00:16:26,660 plus one might be coming from 377 00:16:27,600 --> 00:16:32,220 Django debug toolbar is like a very 378 00:16:30,300 --> 00:16:35,459 important but much more heavyweight 379 00:16:32,220 --> 00:16:37,860 package than n plus one because it is 380 00:16:35,459 --> 00:16:39,720 used for debugging the entire Django 381 00:16:37,860 --> 00:16:42,120 apps and it outlines much more 382 00:16:39,720 --> 00:16:44,820 information so it's like what Django 383 00:16:42,120 --> 00:16:46,259 version are using what SQL queries are 384 00:16:44,820 --> 00:16:49,320 being executed to get that information 385 00:16:46,259 --> 00:16:52,139 how many signals or signal handlers were 386 00:16:49,320 --> 00:16:55,560 executed so it provides much more 387 00:16:52,139 --> 00:16:57,779 information for us SQL queries are 388 00:16:55,560 --> 00:17:01,740 important in identifying and placement 389 00:16:57,779 --> 00:17:03,779 problems so if we uh see a lot of 390 00:17:01,740 --> 00:17:05,760 duplicate queries where the queries are 391 00:17:03,779 --> 00:17:07,559 same but these 392 00:17:05,760 --> 00:17:10,260 filtering criteria is changing so let's 393 00:17:07,559 --> 00:17:12,780 say we are only changing IDs so we are 394 00:17:10,260 --> 00:17:14,880 making 100 same uh 395 00:17:12,780 --> 00:17:16,860 SQL theory was only the ID is changing 396 00:17:14,880 --> 00:17:21,419 that indicates that there might be n 397 00:17:16,860 --> 00:17:24,419 plus one problem uh with that endpoint 398 00:17:21,419 --> 00:17:28,199 so like here's the output for the 399 00:17:24,419 --> 00:17:31,559 author's listing so Django debug toolbar 400 00:17:28,199 --> 00:17:33,480 indicates that okay 197 queries were 401 00:17:31,559 --> 00:17:34,559 executed they are 195 similar so it's 402 00:17:33,480 --> 00:17:36,720 telling their 403 00:17:34,559 --> 00:17:37,559 there are duplicate queries here and 404 00:17:36,720 --> 00:17:39,900 there 405 00:17:37,559 --> 00:17:42,900 and we can identify just looking there 406 00:17:39,900 --> 00:17:45,120 okay we are getting you the information 407 00:17:42,900 --> 00:17:47,640 ID is changing we are getting 408 00:17:45,120 --> 00:17:48,960 specialization the ID is changing so 409 00:17:47,640 --> 00:17:50,820 these duplicate queries are an 410 00:17:48,960 --> 00:17:53,660 indication that we are hitting DB more 411 00:17:50,820 --> 00:17:53,660 than we should 412 00:17:53,880 --> 00:17:58,500 same is the case for the books listing 413 00:17:56,100 --> 00:18:00,539 where like 50 queries and their 51 414 00:17:58,500 --> 00:18:01,919 similar and we can see the pattern here 415 00:18:00,539 --> 00:18:04,380 as well that we are getting user 416 00:18:01,919 --> 00:18:06,780 information or ID is changing category 417 00:18:04,380 --> 00:18:09,059 information same the query the same but 418 00:18:06,780 --> 00:18:12,000 only the ID is changing so it is a 419 00:18:09,059 --> 00:18:14,340 repeat uh queries that is uh indication 420 00:18:12,000 --> 00:18:17,480 that we there might be n plus one 421 00:18:14,340 --> 00:18:17,480 problem here 422 00:18:17,940 --> 00:18:24,960 so we can use both of these tools to 423 00:18:21,600 --> 00:18:28,200 to like see if they are n Plus One 424 00:18:24,960 --> 00:18:30,140 problems on our API uh but once we have 425 00:18:28,200 --> 00:18:32,940 identified the N Plus One problems how 426 00:18:30,140 --> 00:18:35,700 do we resolve those problems so that we 427 00:18:32,940 --> 00:18:39,360 are not hitting DB and 428 00:18:35,700 --> 00:18:41,700 for more than it's needed 429 00:18:39,360 --> 00:18:46,340 this is something that is provided by 430 00:18:41,700 --> 00:18:49,380 the Django or we can there are two uh 431 00:18:46,340 --> 00:18:53,780 methods that we can use on query sets 432 00:18:49,380 --> 00:18:53,780 select related and prefetch related 433 00:18:54,140 --> 00:19:00,480 select related is used to 434 00:18:57,900 --> 00:19:02,220 patch our foreign keys or one-to-one 435 00:19:00,480 --> 00:19:03,480 relationship when we are fetching the 436 00:19:02,220 --> 00:19:09,240 main object 437 00:19:03,480 --> 00:19:11,400 so uh it is a join on SQL level and get 438 00:19:09,240 --> 00:19:12,500 three main objects and the one to one 439 00:19:11,400 --> 00:19:14,820 then 440 00:19:12,500 --> 00:19:18,360 foreign key relationship in a single 441 00:19:14,820 --> 00:19:21,720 query and the neat thing about is uh is 442 00:19:18,360 --> 00:19:25,200 that we can also fetch the rated related 443 00:19:21,720 --> 00:19:28,080 objects uh in that same call as well so 444 00:19:25,200 --> 00:19:31,140 we can use one query to get the entity 445 00:19:28,080 --> 00:19:32,280 into information 446 00:19:31,140 --> 00:19:34,320 so 447 00:19:32,280 --> 00:19:36,559 previously we were doing book dot 448 00:19:34,320 --> 00:19:39,240 objects at all now we can change it to 449 00:19:36,559 --> 00:19:41,640 book.objects and then we can specify 450 00:19:39,240 --> 00:19:43,440 which relationship to fetch alongside 451 00:19:41,640 --> 00:19:44,120 books we can like fetch the author 452 00:19:43,440 --> 00:19:48,179 information 453 00:19:44,120 --> 00:19:50,460 authors user information book type 454 00:19:48,179 --> 00:19:52,080 and once we add that we can see a 455 00:19:50,460 --> 00:19:54,480 difference in the log of n plus one 456 00:19:52,080 --> 00:19:57,900 where previously 457 00:19:54,480 --> 00:20:01,320 uh it was telling that type and author 458 00:19:57,900 --> 00:20:04,380 have n plus one now it's just telling 459 00:20:01,320 --> 00:20:07,799 that text set as unfortunate problem so 460 00:20:04,380 --> 00:20:11,220 by using this simple uh method we have 461 00:20:07,799 --> 00:20:13,260 reduced the the N plus one to only one 462 00:20:11,220 --> 00:20:15,960 field 463 00:20:13,260 --> 00:20:20,700 and we can also see a reduction in our 464 00:20:15,960 --> 00:20:23,280 SQL care is that now uh there is one 465 00:20:20,700 --> 00:20:24,900 join operation that's getting the book 466 00:20:23,280 --> 00:20:26,460 information author information and 467 00:20:24,900 --> 00:20:28,440 author user information so all of that 468 00:20:26,460 --> 00:20:30,299 is happening in a single query 469 00:20:28,440 --> 00:20:34,860 there are still duplicate queries that 470 00:20:30,299 --> 00:20:37,380 will be resolved uh in a upcoming sites 471 00:20:34,860 --> 00:20:40,020 we can use the same select related for 472 00:20:37,380 --> 00:20:42,240 on our author listing as well 473 00:20:40,020 --> 00:20:45,000 and again there will be reduction in SQL 474 00:20:42,240 --> 00:20:47,820 queries here at 12. 475 00:20:45,000 --> 00:20:49,860 so where previously uh there were 53 476 00:20:47,820 --> 00:20:52,380 queries for book listing which select 477 00:20:49,860 --> 00:20:57,960 related that can't reduce to 23 for 478 00:20:52,380 --> 00:20:57,960 author we are down to one uh 39. 479 00:20:58,140 --> 00:21:04,919 prefetch related is like used to get our 480 00:21:02,100 --> 00:21:06,780 foreign reverse foreign key or money to 481 00:21:04,919 --> 00:21:09,600 many relationships 482 00:21:06,780 --> 00:21:10,980 the difference with prefetch related is 483 00:21:09,600 --> 00:21:13,919 that uh 484 00:21:10,980 --> 00:21:16,799 it prefetches the required object via 485 00:21:13,919 --> 00:21:19,020 SQL Theory and then performs the giant 486 00:21:16,799 --> 00:21:22,799 operation in Python 487 00:21:19,020 --> 00:21:26,460 so select related uh is doing the join 488 00:21:22,799 --> 00:21:28,140 in DB level prefatch related is doing a 489 00:21:26,460 --> 00:21:31,200 join on our 490 00:21:28,140 --> 00:21:32,820 python level so that's uh 491 00:21:31,200 --> 00:21:35,460 big difference here 492 00:21:32,820 --> 00:21:37,500 uh the idea is that since uh with many 493 00:21:35,460 --> 00:21:41,700 to many relationship we have an 494 00:21:37,500 --> 00:21:45,000 intermediate table so uh when we are 495 00:21:41,700 --> 00:21:48,419 joining three tables for forum like one 496 00:21:45,000 --> 00:21:51,299 man-to-man relationship the resultant 497 00:21:48,419 --> 00:21:53,940 table that is like obtained after all 498 00:21:51,299 --> 00:21:57,360 the Giants can become very huge and that 499 00:21:53,940 --> 00:22:00,059 can cause uh problem issues in itself 500 00:21:57,360 --> 00:22:04,320 so for that purpose uh the mind-to-many 501 00:22:00,059 --> 00:22:07,520 or money to one join is performed in the 502 00:22:04,320 --> 00:22:07,520 python code base 503 00:22:08,280 --> 00:22:15,059 so if we go back to our book listing we 504 00:22:12,240 --> 00:22:17,700 can change this operation so after 505 00:22:15,059 --> 00:22:19,980 selected we can prefatch related the 506 00:22:17,700 --> 00:22:21,480 tags information and like author 507 00:22:19,980 --> 00:22:24,539 specializations 508 00:22:21,480 --> 00:22:26,159 and the result of that is we are getting 509 00:22:24,539 --> 00:22:29,280 all the information and like five 510 00:22:26,159 --> 00:22:31,080 queries so where there were like 50 plus 511 00:22:29,280 --> 00:22:33,120 here is 100 plus Q is now we are getting 512 00:22:31,080 --> 00:22:35,280 that information in five queries 513 00:22:33,120 --> 00:22:36,480 so we can see the DraStic difference 514 00:22:35,280 --> 00:22:38,580 here 515 00:22:36,480 --> 00:22:40,860 uh 516 00:22:38,580 --> 00:22:43,620 the the thing I'll Focus here is that we 517 00:22:40,860 --> 00:22:46,220 we can see that we are fetching uh like 518 00:22:43,620 --> 00:22:48,600 tags or specialization against special 519 00:22:46,220 --> 00:22:50,280 or certain IDs 520 00:22:48,600 --> 00:22:51,900 so like these will be all the authors 521 00:22:50,280 --> 00:22:53,940 that we have fetched and only those 522 00:22:51,900 --> 00:22:56,100 specializations will be fetched and then 523 00:22:53,940 --> 00:22:58,020 python will take care python jungle will 524 00:22:56,100 --> 00:23:00,419 take care of the internal joint 525 00:22:58,020 --> 00:23:02,820 operation 526 00:23:00,419 --> 00:23:05,760 same is the case on the other listing we 527 00:23:02,820 --> 00:23:07,200 can uh chain all of the operations and 528 00:23:05,760 --> 00:23:10,320 then see the 529 00:23:07,200 --> 00:23:14,780 output of that that we're doing all of 530 00:23:10,320 --> 00:23:14,780 the fetching in seven queries 531 00:23:15,480 --> 00:23:21,480 so if we compare SQL queries before and 532 00:23:18,539 --> 00:23:23,520 after uh 533 00:23:21,480 --> 00:23:26,100 previously we were 534 00:23:23,520 --> 00:23:28,080 getting all the books in 53 queries with 535 00:23:26,100 --> 00:23:31,260 both select and prefatch related we are 536 00:23:28,080 --> 00:23:34,500 doing that operation with five queries 537 00:23:31,260 --> 00:23:38,039 on author one for we have gone from 197 538 00:23:34,500 --> 00:23:40,440 to 7. so that's a like pretty big 539 00:23:38,039 --> 00:23:43,260 reduction in SQL theories and how many 540 00:23:40,440 --> 00:23:45,539 DB round trips that we are making 541 00:23:43,260 --> 00:23:47,580 and this select and prefatchellated it 542 00:23:45,539 --> 00:23:50,159 is something that drf also recommends 543 00:23:47,580 --> 00:23:52,980 using to avoid n Plus One problems on 544 00:23:50,159 --> 00:23:53,880 both our serializer level and on view 545 00:23:52,980 --> 00:23:55,860 level 546 00:23:53,880 --> 00:23:57,900 uh 547 00:23:55,860 --> 00:24:00,299 the prefetching is something that drf 548 00:23:57,900 --> 00:24:02,400 does not performs on its own because 549 00:24:00,299 --> 00:24:04,559 like it is using curious as and curious 550 00:24:02,400 --> 00:24:07,380 as or lazy so that's why the 551 00:24:04,559 --> 00:24:11,400 recommendation by drf is that use both 552 00:24:07,380 --> 00:24:14,880 of these built-in tools and get the 553 00:24:11,400 --> 00:24:16,440 required related objects yourself 554 00:24:14,880 --> 00:24:18,600 so they have mentioned that you see 555 00:24:16,440 --> 00:24:20,340 programmer responsibilities to optimize 556 00:24:18,600 --> 00:24:23,520 those queries 557 00:24:20,340 --> 00:24:27,299 so again with plf it's really easy to 558 00:24:23,520 --> 00:24:30,179 create apis but uh with the and and plus 559 00:24:27,299 --> 00:24:31,860 one problem we can see a lot of 560 00:24:30,179 --> 00:24:34,980 performance issues when our application 561 00:24:31,860 --> 00:24:37,200 scale so it's better to identify those 562 00:24:34,980 --> 00:24:41,820 problem beforehand and resolve them 563 00:24:37,200 --> 00:24:41,820 using select or prefetch related 564 00:24:42,240 --> 00:24:49,620 okay we are at the end uh so n plus one 565 00:24:46,620 --> 00:24:54,120 is like a pretty common problem in both 566 00:24:49,620 --> 00:24:56,520 code Django views and drf uh views 567 00:24:54,120 --> 00:24:59,820 we would not know the performance impact 568 00:24:56,520 --> 00:25:02,460 of these issues like unless our 569 00:24:59,820 --> 00:25:04,679 underlying data hits a certain limit so 570 00:25:02,460 --> 00:25:08,039 let's say if we have limited number of 571 00:25:04,679 --> 00:25:09,659 objects both on production or local we 572 00:25:08,039 --> 00:25:11,780 might not know that something is wrong 573 00:25:09,659 --> 00:25:14,940 uh when it comes to fetching information 574 00:25:11,780 --> 00:25:16,380 but once the data increase we will see 575 00:25:14,940 --> 00:25:18,720 we'll start to see the performance 576 00:25:16,380 --> 00:25:21,240 implication or on our production 577 00:25:18,720 --> 00:25:23,940 so like we can use the tools like n plus 578 00:25:21,240 --> 00:25:27,200 one and debug Toolbar to identify these 579 00:25:23,940 --> 00:25:30,720 problems beforehand and fix them before 580 00:25:27,200 --> 00:25:33,240 this problem appears with the scale 581 00:25:30,720 --> 00:25:37,020 one thing to note is that that n plus 582 00:25:33,240 --> 00:25:41,039 one fix up scan results in complex joins 583 00:25:37,020 --> 00:25:44,220 uh on DB levels that can sometimes cause 584 00:25:41,039 --> 00:25:47,460 negative impacts on the performance 585 00:25:44,220 --> 00:25:49,799 so uh in the current example our 586 00:25:47,460 --> 00:25:53,159 relationship is quite simple but they 587 00:25:49,799 --> 00:25:56,640 will be uh reliable applications where 588 00:25:53,159 --> 00:25:57,900 we might have 20 plus one one to one or 589 00:25:56,640 --> 00:26:00,059 one to many or main to many 590 00:25:57,900 --> 00:26:02,100 relationships 591 00:26:00,059 --> 00:26:04,200 if we are prefetching all of that 592 00:26:02,100 --> 00:26:09,120 information in a single go 593 00:26:04,200 --> 00:26:11,100 the resultant join query in SQL might be 594 00:26:09,120 --> 00:26:15,600 inefficient or might not be using index 595 00:26:11,100 --> 00:26:17,760 so if your model is a bit complicated uh 596 00:26:15,600 --> 00:26:20,220 try a combination of Select and prefetch 597 00:26:17,760 --> 00:26:22,740 related to see which combination yields 598 00:26:20,220 --> 00:26:25,320 the best results for you uh you can 599 00:26:22,740 --> 00:26:29,159 maybe use uh caching on certain query 600 00:26:25,320 --> 00:26:31,500 sets to uh avoid hitting the database 601 00:26:29,159 --> 00:26:33,659 again and again 602 00:26:31,500 --> 00:26:37,260 so let's say for the book categories we 603 00:26:33,659 --> 00:26:39,600 can use some sort of cache to previous 604 00:26:37,260 --> 00:26:42,720 uh that information in the cache and use 605 00:26:39,600 --> 00:26:44,159 that cache information so click 606 00:26:42,720 --> 00:26:46,260 the bad 607 00:26:44,159 --> 00:26:49,200 try a bunch of combination see which 608 00:26:46,260 --> 00:26:52,980 yields the best results for you it will 609 00:26:49,200 --> 00:26:55,020 vary on case to case uh basis but in 610 00:26:52,980 --> 00:26:56,880 most cases using select related and 611 00:26:55,020 --> 00:26:58,159 prefatch related will resolve a lot of 612 00:26:56,880 --> 00:27:00,419 the 613 00:26:58,159 --> 00:27:03,120 like performance related issues on the 614 00:27:00,419 --> 00:27:04,320 RF APS 615 00:27:03,120 --> 00:27:07,200 uh 616 00:27:04,320 --> 00:27:10,440 I just talk was inspired from a similar 617 00:27:07,200 --> 00:27:13,500 talk at python Web Conference so it's 618 00:27:10,440 --> 00:27:15,900 just the talk title fighting and fixing 619 00:27:13,500 --> 00:27:18,360 and plus one problem in Django dogs uh 620 00:27:15,900 --> 00:27:21,840 you can like see that uh the recording 621 00:27:18,360 --> 00:27:26,100 of that stock on YouTube the slides for 622 00:27:21,840 --> 00:27:27,799 my talk or present on my GitHub same is 623 00:27:26,100 --> 00:27:31,860 the case for the code 624 00:27:27,799 --> 00:27:33,240 uh there are a bunch of the references 625 00:27:31,860 --> 00:27:34,200 for the different terms that we have 626 00:27:33,240 --> 00:27:37,140 talked about 627 00:27:34,200 --> 00:27:39,240 if you have any feedbacks suggestions 628 00:27:37,140 --> 00:27:41,520 comments like if you were looking for 629 00:27:39,240 --> 00:27:44,940 something that I did not add or if it 630 00:27:41,520 --> 00:27:48,659 was a bit too much please uh leave your 631 00:27:44,940 --> 00:27:51,120 feedback comment uh you can add the 632 00:27:48,659 --> 00:27:53,159 questions on Discord and I'll happy to 633 00:27:51,120 --> 00:27:56,039 answer them async 634 00:27:53,159 --> 00:27:59,179 uh thank you very very much and have a 635 00:27:56,039 --> 00:27:59,179 nice rest of the conference 636 00:27:59,710 --> 00:28:05,580 [Applause] 637 00:28:02,640 --> 00:28:08,760 I hope you can hear all that clapping 638 00:28:05,580 --> 00:28:11,520 yep excellent uh would you like to take 639 00:28:08,760 --> 00:28:13,460 one question from the audience we have 640 00:28:11,520 --> 00:28:17,100 time for one 641 00:28:13,460 --> 00:28:20,120 uh sure yeah okay if we have one 642 00:28:17,100 --> 00:28:20,120 we can do one 643 00:28:20,760 --> 00:28:25,620 okay it looks like most of the questions 644 00:28:22,799 --> 00:28:27,779 will be coming to you on Discord 645 00:28:25,620 --> 00:28:29,340 got it got it because even though your 646 00:28:27,779 --> 00:28:32,120 talk wasn't on asynchronous you're going 647 00:28:29,340 --> 00:28:32,120 to be asynchronous 648 00:28:32,480 --> 00:28:38,580 thank you so much 649 00:28:35,820 --> 00:28:42,630 we will clap again now foreign 650 00:28:38,580 --> 00:28:42,630 [Applause]