26 May 2022 |
paddenpaadje | I'm trying to do something like this. It should join and the table should be full but it's empty | 20:57:26 |
paddenpaadje | I have no idea what I'm doing wrong lol | 20:57:41 |
paddenpaadje | [image: unknown.png] | 20:58:50 |
Majestic#1066 | And addressDS is a Dataset[RawAddress], I guess? | 21:02:46 |
paddenpaadje | yes | 21:02:52 |
paddenpaadje | [image: unknown.png] | 21:03:28 |
paddenpaadje | also | 21:03:28 |
Majestic#1066 | Have you tried explicitly declaring the join condition? Like ds1.col("idLeft") === ds2.col("idRight") | 21:03:34 |
paddenpaadje | yes | 21:03:39 |
paddenpaadje | doesn't work | 21:03:42 |
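For reference, an explicit join condition in Spark's Scala API is built with `===` on two columns. The sketch below is only an illustration: the case class fields, the `personDS` data and the `ExplicitJoin` object are assumptions, since the chat only names `addressDS` and `RawAddress`.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record types: only the name RawAddress appears in the chat,
// the fields are assumed for illustration.
case class RawPerson(id: Long, name: String)
case class RawAddress(personId: Option[Long], city: String)

object ExplicitJoin {
  // Joins people to addresses on an explicitly declared condition.
  def joined(spark: SparkSession): Dataset[(RawPerson, RawAddress)] = {
    import spark.implicits._

    val personDS  = Seq(RawPerson(1L, "Ann"), RawPerson(2L, "Bob")).toDS()
    val addressDS = Seq(RawAddress(Some(1L), "Ghent"), RawAddress(None, "Unknown")).toDS()

    // Column equality is `===`; a single `=` does not build a join condition.
    personDS.joinWith(addressDS, personDS("id") === addressDS("personId"), "inner")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("explicit-join").getOrCreate()
    joined(spark).show(false)
    spark.stop()
  }
}
```

If a join like this comes back empty even though the condition is right, the keys themselves are the next thing to inspect.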
Majestic#1066 | Are your IDs actually matching? | 21:06:16 |
Majestic#1066 | (by that I mean, yes it should join, so maybe the problem is on the data itself?) | 21:07:05 |
paddenpaadje | I've checked some records and there should be a lot of matching data | 21:07:20 |
paddenpaadje | [image: unknown.png] | 21:07:29 |
paddenpaadje | what I'm trying to achieve | 21:07:29 |
Majestic#1066 | You could check by collecting both sides and comparing the IDs in plain Scala? | 21:13:32 |
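One way to do that comparison, sketched in plain Scala with no Spark dependency. It assumes the keys have already been pulled to the driver (for example with something like `ds.map(_.id).collect()`, which only makes sense for small data). The `KeyOverlap` object and its names are hypothetical.

```scala
object KeyOverlap {
  // Summary of how two key collections overlap.
  final case class Overlap(both: Set[Long], leftOnly: Set[Long], rightOnly: Set[Long])

  // Left keys are plain Long, right keys are Option[Long], as in the chat.
  def compare(left: Seq[Long], right: Seq[Option[Long]]): Overlap = {
    val l = left.toSet
    val r = right.flatten.toSet // drops missing keys (None)
    Overlap(l intersect r, l diff r, r diff l)
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample keys, purely for illustration.
    println(compare(Seq(1L, 2L, 3L), Seq(Some(2L), None, Some(9L))))
  }
}
```

An empty `both` set would point at the data (or at how it was parsed) rather than at the join itself.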
paddenpaadje | I'll try to go back in the case class and change the types since one is type Long and other Option[Long] | 21:13:56 |
paddenpaadje | I've checked manually in csv files | 21:14:24 |
Majestic#1066 | Maybe I'm missing something, but I thought this should not matter too much | 21:16:10 |
paddenpaadje | I've thought that too, when I print the schema it says both are Long | 21:16:43 |
paddenpaadje | let me try it | 21:16:45 |
paddenpaadje | I mean I can't think of another thing that could be causing the issue | 21:17:07 |
Majestic#1066 | Yes on the dataframe it will just appear as Long nullable=true | 21:18:08 |
Majestic#1066 | Unless the cast isn't going well, but that would be surprising | 21:18:42 |
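A small sketch of the point about `Long` versus `Option[Long]`: both fields map to Spark's `LongType`, and only the nullability derived from the case class differs. The two case classes here are hypothetical stand-ins, not the ones from the chat.

```scala
import org.apache.spark.sql.Encoders

// Hypothetical stand-ins for the two case classes being compared.
case class WithLong(id: Long)
case class WithOption(id: Option[Long])

object SchemaCheck {
  def main(args: Array[String]): Unit = {
    // Schema fields derived from the case classes; no SparkSession is needed.
    val plain    = Encoders.product[WithLong].schema("id")
    val optional = Encoders.product[WithOption].schema("id")

    // Same data type on both sides; only the nullable flag differs.
    println(s"Long:         ${plain.dataType}, nullable=${plain.nullable}")
    println(s"Option[Long]: ${optional.dataType}, nullable=${optional.nullable}")
  }
}
```

Since the column type is the same either way, switching between `Long` and `Option[Long]` would not be expected to change which rows match.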
paddenpaadje | lol it's not it | 21:26:54 |
paddenpaadje | tried it | 21:27:05 |
Majestic#1066 | You can go full "manual join" by doing 2 groupBy and a cogroup operation (and drop into debug there). Then you will see what matches and maybe understand what's going wrong | 21:34:23 |
paddenpaadje | thanks for the idea, will try it | 21:58:39 |
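A rough sketch of what that "manual join" might look like with the typed Dataset API, where the grouping step is `groupByKey` followed by `cogroup`. The case class fields, the sample data and the `-1L` sentinel for missing keys are all assumptions made for illustration.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record types; the fields are assumed.
case class RawPerson(id: Long, name: String)
case class RawAddress(personId: Option[Long], city: String)

object ManualJoin {
  // Returns (key, left row count, right row count) for every key on either side.
  def keyCounts(spark: SparkSession,
                people: Dataset[RawPerson],
                addresses: Dataset[RawAddress]): Dataset[(Long, Int, Int)] = {
    import spark.implicits._

    val left  = people.groupByKey(_.id)
    // Assumption: real keys are never negative, so -1 can stand in for None.
    val right = addresses.groupByKey(_.personId.getOrElse(-1L))

    // A breakpoint or println inside this function shows what pairs up per key.
    left.cogroup(right) { (key, ls, rs) => Iterator((key, ls.size, rs.size)) }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("manual-join").getOrCreate()
    import spark.implicits._

    val people    = Seq(RawPerson(1L, "Ann"), RawPerson(2L, "Bob")).toDS()
    val addresses = Seq(RawAddress(Some(1L), "Ghent"), RawAddress(None, "Unknown")).toDS()

    keyCounts(spark, people, addresses).show(false)
    spark.stop()
  }
}
```

Keys with a zero on either side are the ones an inner join would drop, which makes it easier to see whether anything matches at all.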