arraytable (objecttable) fails when the argument is a nested table #22

Open
sprmnt21 opened this issue May 6, 2022 · 2 comments

sprmnt21 commented May 6, 2022

I wanted to see the JSON corresponding to a DataFrame that has DataFrames as the elements of one of its columns.

3×3 DataFrame
 Row │ f       oc     sdf
     │ String  Int64  DataFrame
─────┼──────────────────────────────
   1 │ f1          1  3×2 DataFrame
   2 │ f2          2  2×2 DataFrame
   3 │ f3          3  3×2 DataFrame

I have previously verified that

julia> Tables.istable(dfn)
true

but I get this error:

julia> objecttable(dfn)
ERROR: ArgumentError: DataFrame doesn't have a defined `StructTypes.StructType`
Stacktrace:
...

I wonder whether it is possible to handle this type of structure.
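A possible interim workaround, offered only as a sketch: since the error comes from JSON3 not knowing how to serialize a DataFrame, each nested DataFrame can be converted to a row table (a vector of NamedTuples) before calling objecttable. The small dfn below and the name plain are hypothetical stand-ins, not the data from this issue, and the approach assumes JSON3's default handling of NamedTuples is acceptable for the nested rows.

```julia
using DataFrames, Tables, JSONTables

# Hypothetical stand-in for `dfn`: a DataFrame with a DataFrame-valued column.
dfn = DataFrame(
    f = ["f1", "f2"],
    oc = [1, 2],
    sdf = [DataFrame(a = [1, 2], b = ["x", "y"]), DataFrame(a = [3], b = ["z"])],
)

# Replace each nested DataFrame with a vector of NamedTuples (a row table),
# which JSON3 can serialize without a custom StructTypes definition.
plain = copy(dfn)
plain.sdf = map(Tables.rowtable, dfn.sdf)

# Column-oriented JSON string; arraytable(plain) would give the row-oriented form.
json = objecttable(plain)
println(json)
```

This only handles one level of nesting; deeper nesting would need the conversion applied recursively.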

If I use jsontable on this string (from http://bl.ocks.org/nautat/4085017)

jdata = "[
    {
        \"name\":\"bob\",
        \"salary\":13000,
        \"friends\":[
            {
                \"name\": \"sarah\",
                \"salary\":10000
            },
            {
                \"name\": \"bill\",
                \"salary\":5000
            }
        ]
    },
    {
        \"name\":\"marge\",
        \"salary\":10000,
        \"friends\":[
            {
                \"name\": \"rhonda\",
                \"salary\":10000
            },
            {
                \"name\": \"mike\",
                \"salary\":5000,
                \"hobbies\":[
                    {
                        \"name\":\"surfing\",
                        \"frequency\":10
                    },
                    {
                        \"name\":\"surfing\",
                        \"frequency\":15
                    }
                ]
            }
        ]
    },
    {
        \"name\":\"joe\",
        \"salary\":10000,
        \"friends\":[
            {
                \"name\": \"harry\",
                \"salary\":10000
            },
            {
                \"name\": \"sally\",
                \"salary\":5000
            }
        ]
    }
]"

and then try to build a DataFrame from it, the friends column is left as an array of JSON3.Object values instead of a nested table:

julia> jsontable(jdata)
JSONTables.Table{false, JSON3.Array{JSON3.Object, Base.CodeUnits{UInt8, String}, Vector{UInt64}}}([:name, :salary, :friends], Dict{Symbol, Type}(:name => String, :salary => Int64, :friends => JSON3.Array{JSON3.Object, Base.CodeUnits{UInt8, String}, SubArray{UInt64, 1, Vector{UInt64}, Tuple{UnitRange{Int64}}, true}}), JSON3.Object[{
...



julia> DataFrame(jsontable(jdata))
3×3 DataFrame
 Row │ name    salary  friends
     │ String  Int64   Array…
─────┼───────────────────────────────────────────────────
   1 │ bob      13000  JSON3.Object[{\n     "name": "sa…
   2 │ marge    10000  JSON3.Object[{\n     "name": "rh…
   3 │ joe      10000  JSON3.Object[{\n     "name": "ha…

quinnj (Member) commented May 8, 2022

Yeah, it's true that the machinery in JSONTables.jl isn't set up very well right now for the nested table case. We'll have to think through what makes sense here; maybe we try to apply the sink recursively?
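As a rough illustration of what applying the sink recursively could mean on the reading side, here is a one-level sketch done by hand in user code. It is not how JSONTables.jl works internally, the shortened jdata is a hypothetical stand-in for the string in this issue, and it assumes jsontable accepts the nested JSON3.Array of objects stored in each cell.

```julia
using DataFrames, JSONTables

# Shortened, hypothetical version of the JSON from this issue.
jdata = """
[{"name":"bob","friends":[{"name":"sarah","salary":10000},{"name":"bill","salary":5000}]},
 {"name":"joe","friends":[{"name":"harry","salary":10000}]}]
"""

# Outer table: the friends column holds raw JSON3 arrays at this point.
df = DataFrame(jsontable(jdata))

# Apply the same sink (DataFrame) to each nested array of objects.
df.friends = [DataFrame(jsontable(f)) for f in df.friends]
```

A built-in version would presumably need to detect which columns hold arrays of objects and decide how deep to recurse.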

sprmnt21 (Author) commented May 9, 2022

... or add a keyword argument to choose to flatten everything.
For the example above, that would give something like this:

julia> CSV.read("tabintabm.csv", DataFrame)
7×6 DataFrame
 Row │ name     salary  friends.name  friends.salary  friends.hobbies.name  friends.hobbies.frequency      
     │ String7  Int64   String7       Int64           String7?              Union{Missing, Int64}
─────┼────────────────────────────────────────────────────────────────────────────────────────────────     
   1 │ bob       13000  sarah                  10000  missing                                 missing      
   2 │ bob       13000  bill                    5000  missing                                 missing      
   3 │ marge     10000  rhonda                 10000  missing                                 missing      
   4 │ marge     10000  mike                    5000  surfing                                      10      
   5 │ marge     10000  mike                    5000  surfing                                      15      
   6 │ joe       10000  harry                  10000  missing                                 missing      
   7 │ joe       10000  sally                   5000  missing                                 missing      
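
The flattening shown above could be sketched in user code roughly as follows. This is an assumption-laden illustration, not an existing JSONTables.jl option: flatten_rows is a hypothetical helper, the jdata here is a shortened stand-in, nested arrays are assumed to contain only objects, and an empty nested array would drop its parent row.

```julia
using JSON3

# Shortened, hypothetical version of the JSON from this issue.
jdata = """
[{"name":"bob","salary":13000,"friends":[{"name":"sarah","salary":10000},{"name":"bill","salary":5000}]},
 {"name":"marge","salary":10000,"friends":[{"name":"rhonda","salary":10000},
   {"name":"mike","salary":5000,"hobbies":[{"name":"surfing","frequency":10},{"name":"surfing","frequency":15}]}]}]
"""

# Hypothetical helper: expand one JSON object into flat rows keyed by dotted names.
function flatten_rows(obj, prefix = "")
    rows = [Dict{String,Any}()]
    for (k, v) in pairs(obj)
        name = prefix * String(k)
        if v isa AbstractVector
            # Nested array of objects: one output row per (recursively flattened) element.
            children = reduce(vcat, [flatten_rows(c, name * ".") for c in v];
                              init = Dict{String,Any}[])
            rows = [merge(r, c) for r in rows for c in children]
        else
            # Scalar value: add it to every row built so far.
            rows = [merge(r, Dict{String,Any}(name => v)) for r in rows]
        end
    end
    return rows
end

rows = reduce(vcat, [flatten_rows(o) for o in JSON3.read(jdata)])
```

Each element of rows is a Dict whose keys match the dotted column names above; keys that are absent from a row (for example friends.hobbies.name for friends without hobbies) would still need to be filled with missing before building a table.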
