Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate TEXT fields from String #4713

Closed
nlarusstone opened this issue Sep 3, 2019 · 66 comments
Closed

Generate TEXT fields from String #4713

nlarusstone opened this issue Sep 3, 2019 · 66 comments
Labels
domain/schema Issue in the "Schema" domain: Prisma Schema, Introspection, Migrations etc. kind/feature A request for a new feature. topic: client types Types in Prisma Client topic: native database types

Comments

@nlarusstone
Copy link

Hi,

I’m using prisma2 lift to generate my database, but it’s generating all my String fields as VARCHAR with size 191. Unfortunately, when I go to populate these fields, some of my data has more than 191 characters. Is there a way I can get prisma2 lift to generate TEXT fields or specify my desired length of VARCHAR?

Thanks!

@nlarusstone
Copy link
Author

MacOS version: 10.13.6
Prisma Version: [email protected], binary version: c1f6f69
MySQL version: 5.7.14

@schickling
Copy link
Member

This should be a pretty trivial case. @matthewmueller can you help out here?

@robmurtagh
Copy link

Just FYI on this one, all of the docs (Data Modelling, MySQL Data Modelling) say that String should be mapped to TEXT.

I guess maybe VARCHAR allows the column to be used as an index?

This one is a big issue for us.

@matthewmueller
Copy link
Contributor

matthewmueller commented Sep 4, 2019

@mavilein
Copy link
Contributor

mavilein commented Sep 6, 2019

The comment in the code explains why we are using this type right now:

// we use varchar right now as mediumtext doesn't allow default values
// a bigger length would not allow to use such a column as primary key

@matthewmueller : Let's discuss in person how we should handle this next week.

@mavilein mavilein assigned matthewmueller and unassigned mavilein Sep 9, 2019
@mavilein
Copy link
Contributor

mavilein commented Sep 9, 2019

Talked with @matthewmueller. He will write an update how we could tackle this.

@matthewmueller
Copy link
Contributor

matthewmueller commented Sep 9, 2019

Thanks @mavilein. Quick brain dump in why this is not as easy as changing one line of code. Given the following example:

model User {
  id String @id
  email String @unique
  bio String
}

If we do TEXT, then id and email won't work because you cannot have a unique index on TEXT in MySQL.

If we do VARCHAR(191), we can't have a bio that's longer than 191 characters. If you're curious why 191, this blog post is great.

The naive approach would be that whenever we see @unique or @id, we change the type. This may lead to migrations issues when adding these attributes, leading to underlying column type unexpectedly changing.

We're going to keep thinking about this but if you have any ideas, let us know!

@nlarusstone
Copy link
Author

Thanks for the responses @mavilein and @matthewmueller!

Maybe a better solution doesn't involve handling this implicitly, but rather requires the user to explicitly specify that they want to use a TEXT type. The syntax would therefore be something like:

model User {
  id   Int    @id
  name String @ms.text
}

You could prevent any annotations of @unique or @id from being applied to fields with that annotation.

Seems related to #335 as well

For us, this is a very basic/important usecase, so any help you can provide would be greatly appreciated!

@schickling
Copy link
Member

@matthewmueller an alternative could be that we acknowledge this as a very common use case and introduce both a String and Text type as Prisma scalar types?

@matthewmueller
Copy link
Contributor

matthewmueller commented Sep 11, 2019

@schickling that's how I'm leaning as well, since the alternatives seem sketchy. But I am worried about a datatype explosion, making it harder for generators to do their job. We might just need to acknowledge this and deal with it.

Actually, perhaps generators could have a different mapping (more restrictive) than lift.

@matthewmueller
Copy link
Contributor

matthewmueller commented Sep 19, 2019

My proposal here is that given the following SQL:

create extension if not exists citext;

create table if not exists blogs (
  id serial primary key not null,
  website text not null unique,
  created_at timestamp not null default now()
);

create table if not exists users (
  id serial primary key not null,
  email citext not null unique,
  location point not null,
  first_name text not null,
  last_name text not null,
  created_at timestamp not null default now()
);

create table if not exists posts (
  id serial primary key not null,
  blog_id int references blogs (id) on delete cascade on update cascade,
  author_id int references users (id) on delete cascade on update cascade,
  title text not null,
  created_at timestamp not null default now()
);

create table if not exists comments (
  id serial primary key not null,
  post_id int references posts (id) on delete cascade on update cascade,
  comment text not null,
  created_at timestamp not null default now()
);

We generate a schema that looks like this:

generator photon {
  provider = "photonjs"
}

datasource pg {
  provider = "postgresql"
  url      = "postgresql://m@localhost:5432/prisma-blog?schema=public"
}

model blogs {
  id         pg.serial      @id
  created_at pg.timestamp
  posts      posts[]
  website    pg.text   @unique
}

model comments {
  id         pg.serial      @id
  comment    pg.text
  created_at pg.timestamp
  post_id    posts?
}

model migrate {
  version pg.serial @id
}

model posts {
  id         pg.serial        @id
  author_id  users?
  blog_id    blogs?
  comments   comments[]
  created_at pg.timestamp
  title      pg.text
}

model users {
  id         pg.serial      @id
  created_at pg.timestamp
  email      pg.citext
  location   pg.point
  first_name pg.text
  last_name  pg.text
  posts      posts[]
}

Connectors will still know how to map to pg.text to String during photon generation.

cc/ @schickling @sorenbs

@mavilein
Copy link
Contributor

mavilein commented Sep 20, 2019

@matthewmueller : I don't think this is the way to go here. I see 2 major problems with this approach:

  1. It becomes very hard to port a Prisma schema from one datasource to the other. Imagine finding a tutorial online for Postgres and you want to use MySQL. Now you can have to remap all those fields.
  2. This approach has the implicit assumption that one field of a model always interacts with exactly one datasource. This is a very limiting in a scenario with multiple databases:
    • Imagine a @cache(mongo) annotation on a model that instructs photon to maintain a read only cache in a MongoDB. Now the MongoDB connector needs to understand all custom types.
    • We plan to implement a Change Data Capture System for the prisma framework. This system would suddenly need to be able to understand all custom types as well.

@matthewmueller
Copy link
Contributor

matthewmueller commented Sep 20, 2019

For 1. in our capability map, we'd have in the Postgres connector pg.text => String and in the Mongo connector mongo.string => String. It should be possible to automatically remap with about as good of accuracy as switching the underlying datasource and hoping that String matches String.

For 2. Yes, multiple datasources in a single model would be possible in this world:

model users {
  id         pg.serial      @id
  created_at pg.timestamp
  email      pg.citext
  location   pg.point
  first_name mongo.string
  last_name  mongo.string
  posts      posts[]
}

I think the @@cache(mongo) would be too, we know all the type mappings.

@mavilein
Copy link
Contributor

We discussed a solution for this. On MySQL we currently map the String core data type to varchar(191). We do this because there's a restriction on the maximum size of a column so it can be part of a (unique) index. Details on this limitation can be found here.
There's a different solution though. We will start to map the String type to Text as specified in the Prisma Schema Language spec. To work around the limitation we will instead configure the index to just consider the first 191 characters of a text column. This could lead to false duplicates when inserting two records with the same first 191 characters but we think this is an acceptable trade off. You are exposed to this limitation also when using MySQL directly.

Hint: The number 191 stems from the fact that index key prefix length in MySQL is 767 bytes. We are using utf8_mb4 encoding on MySQL which uses 4 bytes per character. 191 * 4 = 764

@matthewmueller matthewmueller removed their assignment Sep 26, 2019
@tomhoule
Copy link
Contributor

tomhoule commented Sep 27, 2019

I just started working on this. I immediately hit the default values issue. Note: text columns can have a default value starting with MySQL 8.0 - I haven't looked very deep into this yet but it looks like there is no straightforward way to do this with older versions of MySQL, and prisma 1 handles this by doing the default value generation itself instead of leaving it to the database (?).

Reference: https://dev.mysql.com/doc/refman/8.0/en/data-type-defaults.html

@nlarusstone
Copy link
Author

Hi @mavilein,

Thanks for the update -- the reasoning makes sense and that solution would work for us!

@johannesschobel
Copy link
Contributor

Dear @mavilein ,

wouldn't it be a good approach to add MORE predefined datatypes? i.e., not only have String but also Text, SmallText, MediumText, LargeText and even JSON and so on? (see this issue here: #446 )

I would suggest to have different Grammars (as in Laravels ORM) that map the Prisma Datatypes to their appropriate Database Datatypes. For example, the SQLite Grammar may map the JSON datatype to a Text, whereas the Postgres Grammar map the JSON datatype to JSON.

I added some examples (including code-references) to the linked issue. Maybe this can work out here as well!?

In addition, we could rely on the datasource.provider (i.e., postgresql) to load respective grammar. What do you think?

All the best

@robmurtagh
Copy link

I think that sounds like a very good idea @johannesschobel, as there will be similar issues for the mapping from almost any Lift data type to its Persistent (DB) data type.

Like you say, JSON support e.g. for MySQL would be amazing (we're really missing it), and seems to fall into a very similar category of feature.

@mavilein
Copy link
Contributor

@johannesschobel @robmurtagh : We are currently actively working on specing out Prismas core datatypes and their mappings to the underlying database. Your suggestions are among the ones we have on the table right now 👍 .

@johannesschobel
Copy link
Contributor

That is good to hear! If you have a dummy grammar available to peek, I can surely work on the PostgreSQL grammar

@nlarusstone
Copy link
Author

One thing I'm curious/worried about is the performance implications of this change. I've had bad experiences with slow TEXT fields in the past, so I think it's unlikely that I would want all of my String fields to be TEXT (I'd like to use VARCHAR where possible).

This seems like it would be addressed by @johannesschobel 's suggestion, however.

@johannesschobel
Copy link
Contributor

Dear @nlarusstone ,
yeah, that would be certainly "solved" as you (as the developer) could choose the most appropriate datatype (i.e., Text or String).

I am certainly aware, that the Grammar approach proposed by me is probably the most complex and "costly" one to implement - however, i think, it is also the most versatile and easiest to go on with / built up onto.

@mavilein
Copy link
Contributor

mavilein commented Oct 1, 2019

@nlarusstone : Can you elaborate on the some cases where TEXT fields were slow for you? 🙏

@divyenduz
Copy link
Contributor

Internal note: dropping from current sprint

@ChristophDietrich
Copy link

Any updates on this? Seeing it was dropped multiple times from the current sprin by divyenduz

The possibility to add TEXT fields on schema is necessarily for all bigger projects.

Please push this topic. Thank you!

PS: Need help? Like to work on this topic to get this integration done.

@AnandChowdhary
Copy link

Bumping this because it's been open for a while, and it's pretty important. Json data type is now supported which creates JSON => LONGTEXT in MySQL, so the next logical step would be to allow longer fields.

It's troublesome because LONGTEXT in SQL maps to String when introspecting, but there's no way for the opposite, so our databases using the same Prisma schema turn out inconsistent.

@cannikin
Copy link

Bumping! Also will this take into account if you wanted to set a custom character limit? Is it maybe a new @ directive?

firstName  String  @length(64)
bio        String  @length(text)

Maybe?

@thedavidprice
Copy link

Checking in from the RedwoodJS community (along with Rob above). We're starting to see more conversation about this amongst the community -- it's moving up in priority as a request + need.

If there's anything we can do to help, please don't hesitate to ask. Thanks again, Team Prisma!

@AnandChowdhary
Copy link

Just a quick sidenote: It's a bit weird that Prisma is on v2.2.0 and this issue has been open since the betas. Maybe it'd be a good idea to make a milestone for features that are extremely important for any production use case, like a custom length of field or defaults in JSON (#2556), that should've been there in the stable v2.0.0 but are missing. Perhaps still being in beta would've been better?

@maoosi
Copy link

maoosi commented Jul 13, 2020

Definitely a priority for my team as well! There are a lot of other issues pointing to this same topic and my understanding is that it should be resolved as part of #446

It would be awesome if this could be pushed as a priority for the next release.

@ChristophDietrich
Copy link

It looks like it will never come and taking not very seriously.

Other less important topics are more important for them, really disappointed for myself.

Fixing bugs / problems, should be the highest priority on working.

This topic blocks all my important projects now.

My clients getting very uncomfortable with this new technology and thinking about to spent a lot of time and money to rewrite everything into other db adapter ...

Please @janpio can you push this internally?!?

@thebiglabasky
Copy link

Hi everyone,
We understand the frustration. We're looking at solving this through native type support and started conversations around how we could go about this. This is key for us to solve this right. This also comes with quite a few implications on the entire product (introspection, schema, client and migrations) to ensure we not only solve this specific issue, but make it possible to have sustainable types support once we open up to other database vendors.
We'll come back once the epic created to track that here: #446
Thanks for your patience!

@nikolasburk
Copy link
Member

nikolasburk commented Jul 13, 2020

Hey @ChristophDietrich, @maoosi and others! Thanks a lot for sharing your thoughts and your activity here, this definitely helps us prioritize this feature better! As Hervé pointed out that this change comes with a number of implications, so it's not as trivial and quick to solve.

However, I just want to point out that you can totally use any native DB types with Prisma (just not with Prisma Migrate, yet)! Taking the example from this issue of changing VARCHAR(191) to TEXT.

Assume you have the following Prisma model:

model User {
  id   Int    @id @default(autoincrement())
  name String
}

Prisma Migrate creates the following SQL table for you when running save and up:

CREATE TABLE `User` (
  `id` INT(11) NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
);

You can now manually change the the type of the name column from VARCHAR(191) to TEXT as follows:

ALTER TABLE `User` MODIFY `name` TEXT NOT NULL;

Once you've done this, you can just keep using Prisma Client as before!

@ChristophDietrich
Copy link

@nikolasburk
Thanks for sharing, manually is not very good.

Sure i can change everything manually ... but when working with dev, stage and production and when you don't have permission to one of the latest system, you are not having a good deployment.

No one likes to have a separate sql file to take care of and handling the update process by themselves.

What about to introduce a possibility to add column type changes like you show here with ALTER Table in the "prisma migration" process unless Prisma team is not working on this topic in "full" detail?

@mavilein
Copy link
Contributor

mavilein commented Aug 4, 2020

@ChristophDietrich : We are working currently very actively on Prisma Migrate. This can also be seen on our public roadmap. So this restriction will get lifted soonish.

@tomhoule
Copy link
Contributor

tomhoule commented Nov 5, 2020

Tracking issue for Text in the prisma schema: prisma/prisma-engines#72

@tomhoule tomhoule transferred this issue from prisma/migrate Dec 18, 2020
@tomhoule tomhoule added kind/feature A request for a new feature. domain/schema Issue in the "Schema" domain: Prisma Schema, Introspection, Migrations etc. topic: client types Types in Prisma Client topic: native database types labels Dec 18, 2020
@sohailhaider
Copy link

sohailhaider commented Jan 11, 2021

Any update on this guys? Like is spec ready?

@thebiglabasky
Copy link

Hi @sohailhaider
We released a preview of extended native types support which we're working on integrating with migrate this quarter. This should allow to have more control over this.

@albertoperdomo
Copy link
Contributor

Prisma Migrate (in Preview) now supports native types (in Preview) starting from 2.15.0.

You should be able to customize your String fields to use different underlying columns types, e.g. varchar(50), text, etc. Docs are available here.

Please share your feedback here. Closing.

@calidion
Copy link

it seems unacceptable to change field length for existing data.

You are about to alter the column `cover` on the `thread` table, which contains 264 non-null values. The data in that column will be cast from `VarChar(255)` to `VarChar(191)`

@tomhoule
Copy link
Contributor

@calidion can you elaborate?

@richellyitalo
Copy link

model MyModel {
 ...
 field String @db.Text
 ...
}

@albertoperdomo albertoperdomo removed their assignment May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain/schema Issue in the "Schema" domain: Prisma Schema, Introspection, Migrations etc. kind/feature A request for a new feature. topic: client types Types in Prisma Client topic: native database types
Projects
None yet
Development

No branches or pull requests